ABSTRACT


The  recent  trend  towards  higher  levels  of  automation  in  complex  systems, 
such as in nuclear power plants, air-traffic control and flight management, is
changing  the  role  of  the  human  operator  from  one  of  a  controller  to  one  of  a 
supervisory  decision-maker.  The  operator’s  primary  responsibility  in  this  new 
role  is  to  extract  information  from  his  environment,  and  to  integrate  it  for 
action  selection  and  its  implementation.  The  present  analytic  and  experimental 
research  has  sought  to  understand  human  monitoring,  information-processing  and 
task  selection  procedures  in  dynamic  multi-task  environments,  as  a  preliminary 
step  towards  analyzing  and  evaluating  the  human  component  of  a  supervisory 
control  system. 

A  simple  yet  realistic  computer  representation  of  the  supervisory  decision 
situation  is  developed.  The  experimental  paradigm  retains  the  essence  of  the 
multi-task  decision  problem  by  presenting  the  human  with  a  dynamic  situation 
wherein  tasks  of  different  value,  time  requirement  and  deadline  compete  for  his 
attention.  Via  this  framework,  the  effects  of  various  task  related  variables 
on  the  human  decision-processes  are  studied. 

A normative dynamic decision model (DDM) of human task sequencing
performance is developed. The analytic framework of the DDM is based on modern
control,  estimation  and  semi-Markov  decision  process  theories,  which  provide 
a  general  methodology  for  analyzing  dynamic  decision-making  under  uncertainty. 
Two  novel  features  of  DDM  are  its  explicit  incorporation  of  human  limitations, 
such  as  reaction  time  delays,  randomness,  limited  resolving  power  and  limited 
information-processing  capacity,  and  its  suitability  to  assimilate  new  elements 
of  the  decision  task  as  they  become  considered  and  understood.  Also,  the 
analytic  framework  of  the  DDM  has  been  shown  to  subsume  several  problems  in 
single-processor  sequencing  theory,  Markov  decision  theory  and  priority  queueing 
systems. 

In  order  to  validate  the  model,  several  time-history  and  scalar  measures 
of  performance  are  proposed.  Excellent  model-data  agreement  is  obtained  for 
all  the  experimental  conditions  studied.  Moreover,  the  model  has  been  shown  to 
represent  human  decision  behavior  significantly  better  than  several  heuristic 
sequencing  rules  of  scheduling  theory.  The  model  has  the  potential  for  use  in 
computer-aiding, and could form a significant step towards the modeling of
multi-human behavior in complex, multi-level, multi-task systems.

I. INTRODUCTION AND PROBLEM FORMULATION


An emerging trend in man-machine systems appears to be away from
manual control and toward partial, if not full, automation. In this
regard, the role of the human operator is shifting from one of a direct
system controller to that of a monitor of multiple tasks, or a supervisor
of several semi-automated subsystems. The operator's primary task in
these systems is to extract information from his environment, and
integrate this information for action selection and implementation. In
this context, the monitoring, information-processing and dynamic
(real-time) decision-making skills of the human operator gain prominence
over his sensor/motor skills. In order to properly analyze and evaluate
the human component of a supervisory control system, an understanding of
the human limitations and capabilities as an information-processor and
dynamic decision-maker is essential.

There are two feasible paths that one can follow to develop human
operator decision models, supported by concomitant experimental results,
in complex supervisory control systems. The first approach starts with
one task (or subsystem) and several humans to explore information-sharing
and inter-human dynamics, and then adds more tasks (or subsystems). The
second approach begins by studying single human dynamic decision-making
among multiple tasks, and next introduces multiple decision-makers,
composed of human and, possibly, non-human decision-makers. The latter
route is advocated in this effort.

The present research seeks to understand human information-processing
and task selection procedures in dynamic multi-task environments.
The approach is to assimilate the results of a joint experimental and
analytic program into a normative dynamic decision model (DDM) of human
task sequencing performance. To this end, a general multi-task decision
problem is considered wherein tasks of different value, duration and
deadline compete for the operator's attention. This situation occurs
in targeting selection, air-traffic control, multiple remotely piloted
vehicle (m-RPV) control, process control, power system regulation,
production scheduling, as well as in many other supervisory control
systems. The model that has emerged may be viewed as a basic building
block in the comprehensive understanding of decision-making procedures,
an understanding that could facilitate the modeling of multi-human
behavior in complex, multi-level, multi-task systems.

1.1 Multi-task Decision Problem

We believe that a complete theory of human behavior in multi-task
systems, analogous to Edwards' classification of human response theory
[1], should consist of three parts: (i) a theory of how potential tasks
are identified for consideration; (ii) a theory of the process of
consideration by which all tasks but one are eliminated; and (iii) a
theory about how the chosen task is executed. The last topic involves the
study of human implementation skills, which are of secondary importance
in supervisory control situations. The first topic, that of identifying
potential tasks for consideration, is the problem of creative thinking,
of which little of significance is known at present. However, this may
not be restrictive in most multi-task systems of the type discussed
above. In these systems, the tasks are immediately identified, e.g.,
once a target is detected. The topic of selecting a task for action
from amongst many candidate tasks involves monitoring,
information-processing and dynamic (real-time) decision-making, and is
the problem of interest here.

Fig. 1 shows the fundamental decision-loop that is addressed in
this work. The human decision-process involves 1) whether to process a
task or gather more information (i.e., monitor); and 2) which of N tasks
(N is time-varying) to act upon in order to maximize the system
performance (e.g., maximize reward, minimize regret, etc.). The
decision-loop is dynamic in nature. As time evolves, tasks of different
value, duration (processing time) and opportunity window (deadline)
demand the human's attention, while others depart. The opportunity
windows shrink with time as the tasks approach their deadlines.

In the following, we provide a taxonomy for behavioral decision
theory and show that the multi-task decision problem (MTDP) belongs to
the most general class of decision-processes studied to date, viz., the
semi-Markov decision processes (SMDP). We also summarize the results of
a major literature survey on behavioral decision theory [2], and
critically evaluate the previous (albeit limited) research on multi-task
decision-making, in order to put the nature of the present work in
perspective.

1.2 A Taxonomy for Behavioral Decision Theory

A decision-maker's (DM's) choice in any decision task is a
consequence of what he can do, what he knows and what he wants [3]. "What
he can do" represents the alternatives (possible responses) available to
the DM. "What he knows" refers to the information that the DM has of the
decision situation. This can range from the deterministic situations
where all the relevant variables of the decision process are known, to
the highly probabilistic situations where little information is available
about any variable of interest. Finally, "what he wants" pertains to
the DM's perception of the task objectives and his preferences for the
various outcomes of a decision. These three concepts are fundamental to
every decision-making process.

FIG. 1: DYNAMIC MONITORING/DECISION LOOP FOR A SINGLE OPERATOR IN A
MULTI-TASK ENVIRONMENT

Most  theories  of  individual  choice  behavior  can  be  conveniently 
dichotomized  into  two  distinct  classes  depending  on  the  nature  of  the 
decision  task,  viz.,  single-stage  and  multi-stage  decision  theories.  A 
detailed  classification  of  individual  choice  theories  is  shown  in 
Fig.  2  and  is  clarified  below. 

INDIVIDUAL CHOICE THEORY
   SINGLE-STAGE DECISION THEORY (E,D,r)
   MULTI-STAGE DECISION THEORY (S,E,D,T,r)
      SEQUENTIAL DECISION THEORY (uncontrolled decision process)
      DYNAMIC DECISION THEORY (controlled decision process)
         MARKOV DECISION THEORY
         (stage duration is deterministic or irrelevant)
         SEMI-MARKOV OR MARKOV RENEWAL DECISION THEORY
         (stage duration is random with known distribution)

Legend: S = set of states of the system
        E = set of events
        D = set of possible actions
        T = transformation rule
        r = reward function

FIG. 2: A CLASSIFICATION OF INDIVIDUAL CHOICE THEORIES

1.2.1  Single-Stage  Decision  Theory 

A  single-stage  or  static  decision  process  may  be  represented  as 
in  Fig.  3. 

FIG. 3: FLOW DIAGRAM OF A SINGLE-STAGE DECISION PROCESS

We see that a static decision process can be conveniently characterized
by the triple (E,D,r) where

E = {e} = a finite, non-empty set of external events (also known as
          states of nature, stimuli, hypotheses or diagnoses)

D = {d} = a finite, non-empty set of possible decisions representing
          "what he can do" (also commonly referred to as alternatives,
          responses, or actions)

r = r(e,d) = a reward (return) uniquely associated with the combined
          occurrence of event, e, and decision, d.

Single-stage decision-making problems can be further classified into two
categories depending on the information that the DM possesses (i.e.,
"what he knows") about E. These are decisions with certainty (riskless)
and decisions with uncertainty (under risk). In the former category,
each decision guarantees a reward with certainty, i.e., E is completely
known. In the latter category, only a probability can be assigned to
each $e \in E$ such that

    $\sum_{e \in E} p(e) = 1$.


The mechanics of a static decision problem are as follows: the DM
chooses and executes a decision, d; an event, e, occurs; he receives a
reward, r(e,d), determined by the joint occurrence of the event, e, and
decision, d; and his decisions are mutually independent, i.e., he never
makes another decision based on whatever he may have learned. It is
frequently assumed that the DM chooses his decision to maximize the
expected reward or to minimize regret (i.e., "what he wants"). The widely
studied single-choice gambling paradigms are examples of single-stage
decision tasks.
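
The two objectives named above can be illustrated with a minimal Python
sketch. The reward table, the event probabilities and the function names
are hypothetical, and "minimize regret" is read here as minimizing the
worst-case regret over events, which is only one possible interpretation.

```python
# Hypothetical reward table r(e, d): two events, three decisions.
REWARD = {
    ("e1", "d1"): 4.0, ("e1", "d2"): 1.0, ("e1", "d3"): 2.0,
    ("e2", "d1"): 0.0, ("e2", "d2"): 3.0, ("e2", "d3"): 2.0,
}
P_EVENT = {"e1": 0.6, "e2": 0.4}   # assumed p(e); the values sum to 1
DECISIONS = ["d1", "d2", "d3"]


def expected_reward(d):
    """Expected reward of decision d: sum over events of p(e) * r(e, d)."""
    return sum(p * REWARD[(e, d)] for e, p in P_EVENT.items())


def regret(e, d):
    """Shortfall of d relative to the best decision had e been known."""
    best = max(REWARD[(e, alt)] for alt in DECISIONS)
    return best - REWARD[(e, d)]


def max_expected_reward_decision():
    return max(DECISIONS, key=expected_reward)


def minimax_regret_decision():
    # Assumed reading of "minimize regret": minimize the worst-case regret.
    return min(DECISIONS, key=lambda d: max(regret(e, d) for e in P_EVENT))


print(max_expected_reward_decision())  # d1
print(minimax_regret_decision())       # d3
```

In this toy table the two criteria select different decisions, which is
why the DM's objective has to be specified as part of the decision task.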

1.2.2  Multi-Stage  Decision  Theory 

In single-stage decision-making, the DM must make a single choice
from among a number of alternatives. But in most man-machine and
organizational systems, the DM seldom makes a single isolated decision.
These situations require that the DM evaluate a number of objects or
hypotheses simultaneously as the evidence accumulates sequentially and/or
that he make several interdependent decisions. Thus, an understanding of
human behavior in multi-stage decision-processes is fundamental to
modeling human behavior in dynamic and uncertain environments.

In a multi-stage decision process, the DM makes a sequence of
decisions. These types of processes consist of a series of stages such
that the output of one stage becomes the input to the succeeding stage.
Fig. 4 is representative of a multi-stage decision process [4].


FIG.  4:  FLOW  DIAGRAM  OF  AN  N-STAGE  DECISION  PROCESS 

Referring to Fig. 4, a multi-stage decision process can be characterized
by the pentad (S,D,E,T,r) where

$S = \{s_i\}$ = set of states of the system
$D = \{d_i\}$ = set of possible decisions
$E = \{e_i\}$ = set of events
$T = \{t_i\}$ = set of transformation rules (laws of motion or transition
          functions) that describe the changes in state at each
          stage i
$r = \{r_i\}$ = set of rewards associated with each state transition

The stage-to-stage state transition is governed by the transformation
rule

    $s_{i+1} = t_i(s_i, d_i, e_i)$    (1.1)

The  reward  at  stage  i  is 

    $r_i = r_i(s_i, d_i, s_{i+1}) = \bar{r}_i(s_i, d_i, e_i)$    (1.2)

The DM's information can range from complete knowledge of the event set,
$\{e_i\}$, and the set of transformation rules, $\{t_i\}$, to little or no
knowledge of these variables. Notice that the transformation rule, $t_i$,
and the reward, $r_i$, can be stage dependent (i.e., non-stationary). It is
commonly assumed that the DM chooses his decisions to maximize his
expected reward over N stages. The horizon N may or may not be known to
the DM.
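
As an illustration of the stage-to-stage mechanics, the following minimal
Python sketch rolls out an N-stage process for a given event sequence. The
integer state, the toy transformation rule, the reward and the policy are
all hypothetical stand-ins chosen for brevity; none of them comes from the
present study.

```python
def simulate(policy, transition, reward, s0, events):
    """Roll out an N-stage decision process, with N = len(events).

    transition(i, s, d, e) plays the role of the transformation rule t_i;
    reward(i, s, d, s_next) is the reward attached to the transition.
    """
    s, total = s0, 0.0
    for i, e in enumerate(events):
        d = policy(i, s)                  # decision at stage i
        s_next = transition(i, s, d, e)   # state at stage i + 1
        total += reward(i, s, d, s_next)
        s = s_next
    return total, s


# Hypothetical toy problem: keep an integer state near zero.
def policy(i, s):
    return -1 if s > 0 else (1 if s < 0 else 0)


def transition(i, s, d, e):
    return s + d + e                      # decision and event both move the state


def reward(i, s, d, s_next):
    return -abs(s_next)                   # penalize distance from zero


total, final_state = simulate(policy, transition, reward, s0=0,
                              events=[1, 0, -1, 1])
print(total, final_state)                 # -3.0 1
```

Because the stage index i is passed to every function, stage-dependent
(non-stationary) rules and rewards fit the same interface.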

In studying multi-stage decision processes, a distinction is often
maintained between sequential and dynamic decision processes (see Fig.
2). In sequential decision problems, the evolution of the state of the
system, $s_i$, is independent of the DM's decisions. That is, Eq. (1.1)
becomes

    $s_{i+1} = t_i(s_i, e_i)$    (1.3)

Thus, a sequential decision task is an uncontrolled decision process.
It consists of a sequence of static decision problems repeated
periodically and independently. The information gained from earlier
decisions is useful in making later decisions, but the earlier decisions
do not affect the transformation rule, $t_i$. The operation of a sequential
decision process is as follows: given that the system is in state $s_i$ at
the beginning of a stage i, the DM makes a decision, $d_i$, the system moves
to state $s_{i+1}$ (which may or may not be identical to $s_i$) according to
the transformation rule, and the DM receives a reward
$r_i(s_i, d_i, s_{i+1})$ associated with this transition. Examples of
sequential decision tasks are system failure detection, revision of
opinion, display monitoring, asset selling and optional stopping.

The dynamic decision processes are multi-stage decision tasks in
which the stage-to-stage changes in the state of the system are directly
affected by the DM's previous decisions, as well as by environmental
factors (events) over which the DM exercises no control (see Eq. (1.1)),
i.e., it is a controlled decision process. The set of alternatives and
the information available at later stages are contingent upon earlier
decisions. Thus, the DM has to consider the effect of each of his
decisions on the future states of the system and, consequently, on his
future decisions. The dynamic decision processes can be further
classified into two categories, viz., Markovian and semi-Markovian (see
Fig. 2). The Markov decision process (MDP) has the property that the
stages are of deterministic duration, or their duration is irrelevant to
the decision problem. Multi-stage betting games, inventory control, search
theory and resource allocation are examples of MDP.

The semi-Markov decision process (SMDP), or Markov renewal decision
process, is characterized by the fact that the time between state
transitions is a random variable. The decision epochs in a stationary
SMDP are the times of state transitions. At a decision epoch i, the system
is in state $s_i$. The DM chooses a feasible decision, $d_i$; the system
moves to state $s_{i+1}$ after a random holding time, $\tau_i$, according to
the transformation rule; and the DM receives a reward
$r(s_i, d_i, \tau_i, s_{i+1})$ associated with this transition. The process
continues for finite or infinite time.

A complete characterization of a semi-Markov decision process includes
the hexad (S,D,E,T,H,r) where S, D, E, T and r are as defined earlier, and
H is a holding time function that determines how long the system stays in
a given state before making a transition to another specified state. The
process descriptors (S,D,E,T,H,r) can be time dependent. The
non-stationarity of the decision process can enter either in the form of
time dependent dimension of the spaces (S,D,E), or in the form of time
varying nature of the transformation rules, T; the holding time
functions, H; and the reward structure, r. If a process is a
non-stationary SMDP, the notation (S(t), D(t), E(t), T(t), H(t), r(t)) is
employed to emphasize its time dependence. Here, the decisions are, in
general, continuous functions of time. Some examples of SMDP are targeting
selection, air-traffic control, multi-RPV control, industrial process
control, power system regulation and many other multi-task systems. The
analysis of these systems is arduous, in view of the non-stationarity of
the underlying SMDP. Virtually no significant research has been done by
behavioral decision theorists using semi-Markov decision paradigms.
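
The role of the random holding time can be seen in a small simulation. The
Python sketch below is illustrative only: the state (number of pending
tasks), the two decisions, the exponential holding times and the reward
are hypothetical choices, not the task model of this study.

```python
import random


def simulate_smdp(policy, transition, holding_time, reward, s0, horizon, rng):
    """Simulate decision epochs until the clock reaches `horizon`.

    Returns (total_reward, epochs); each epoch is the tuple
    (t, s, d, tau, s_next) recorded at a state-transition time t.
    """
    t, s, total, epochs = 0.0, s0, 0.0, []
    while t < horizon:
        d = policy(s, t)
        tau = holding_time(s, d, rng)      # random time until next transition
        s_next = transition(s, d, rng)
        total += reward(s, d, tau, s_next)
        epochs.append((t, s, d, tau, s_next))
        t, s = t + tau, s_next
    return total, epochs


# Hypothetical toy model: s = number of pending tasks, capped at 3.
def policy(s, t):
    return "work" if s > 0 else "wait"


def holding_time(s, d, rng):
    # Assumed exponential holding times: mean 1 s working, 2 s waiting.
    return rng.expovariate(1.0 if d == "work" else 0.5)


def transition(s, d, rng):
    arrival = 1 if rng.random() < 0.3 else 0   # a new task may arrive
    if d == "work":
        return min(s - 1 + arrival, 3)
    return min(s + 1, 3)


def reward(s, d, tau, s_next):
    # One unit per completed task, minus a cost for time spent with backlog.
    return (1.0 if d == "work" else 0.0) - 0.1 * tau * s


total, epochs = simulate_smdp(policy, transition, holding_time, reward,
                              s0=2, horizon=20.0, rng=random.Random(0))
print(len(epochs), round(total, 2))
```

Unlike a fixed-stage simulation, the number of decision epochs within the
horizon is itself random here, because each stage lasts a random time.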

1.3 Summary of Research on Behavioral Decision Theory [2]

A brief and selective overview of the theories of individual choice
behavior in static and multi-stage decision tasks was provided earlier
in [2]. The primary purpose of this review was to investigate the
applicability of this body of knowledge to model human
information-processing and decision-making skills in multi-task systems.
The main conclusion was that the multi-task decision problems are more
general than any considered in behavioral decision theory to date.
However, there exist bits and pieces of relevant models and a wide range
of experimental literature that may be useful in modeling human behavior
in multi-task systems. Specifically, the following observations of the
review are relevant to our discussion. The reader is referred to [2] for
additional details.

1.3.1 Single-Stage Decision-Making

Most of the literature on behavioral decision theory is devoted to
single-stage (static) decision-making under risk. The models of risky
decision behavior may be characterized by two alternative descriptions
of the decision task. The first modeling approach, rooted in mathematics
and economics, describes the decision task in terms of probability
distributions over sets of outcomes (events) with little or no attention
paid to the underlying psychological processes of the individual DM.
This approach led to such moment-based models as the Expected Value (EV),
the Expected Utility (EU), the Subjectively Expected Utility (SEU), and
the Risk Preference models. The second modeling approach, rooted mainly
in psychology, characterizes decision tasks in terms of multi-dimensional
stimuli. It assumes that each stimulus forms a basic risk dimension, and
that the DM integrates these dimensions into a judgement or decision.
Thus, this approach led to explanatory models that view decision-making
under risk as a form of information-processing behavior.

The dominant moment-based model for single-stage decision-making is
the subjectively expected utility (SEU) model proposed by Edwards [5].
In this model, the DM is assumed to maximize the subjectively expected
utility of an alternative, d, given by

    $\mathrm{SEU}(d) = \sum_{e \in E} p_s(e)\, U[r(e,d)]$    (1.4)

where $p_s(e)$ is the subjective (perceived) probability of the event, e;
and $U[r(e,d)]$ is the subjective value (utility) of the reward, r(e,d),
associated with the event, e.
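
A minimal Python sketch of the SEU computation follows. The gamble, the
subjective probabilities and the square-root utility are hypothetical
choices made only to show how the SEU and EV rankings can differ.

```python
import math

EVENTS = ["win", "lose"]

# Hypothetical rewards r(e, d) for a risky gamble and a sure payment.
REWARD = {
    ("win", "gamble"): 100.0, ("lose", "gamble"): 0.0,
    ("win", "sure"): 40.0,    ("lose", "sure"): 40.0,
}


def seu(d, p_s, utility):
    """SEU(d) = sum over events of p_s(e) * U[r(e, d)]."""
    return sum(p_s[e] * utility(REWARD[(e, d)]) for e in EVENTS)


p_s = {"win": 0.5, "lose": 0.5}    # assumed subjective probabilities


def identity(x):
    return x                       # U(x) = x reduces SEU to expected value


best_by_ev = max(["gamble", "sure"], key=lambda d: seu(d, p_s, identity))
best_by_seu = max(["gamble", "sure"], key=lambda d: seu(d, p_s, math.sqrt))
print(best_by_ev, best_by_seu)     # gamble sure
```

With the concave utility assumed here, the sure payment is preferred even
though the gamble has the larger expected value.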

In assessing the potential application of moment related versus
multi-dimensional stimuli models to static decision-making under risk, the
following observation was made in [2]: for normative (predictive)
purposes, models based on moments can serve as a first approximation or
as a formal standard against which to compare actual performance.

1.3.2 Multi-Stage Decision-Making

The existing literature on multi-stage decision-making problems
may be grouped under three headings: sequential statistical inference,
optional stopping and dynamic decision-making. The topic of statistical
inference is concerned with the information-processing (diagnostic)
ability of the humans, i.e., the human's ability to assess and revise
probabilities. The optional stopping problem combines
information-processing with simple (usually binary) action selection.
Finally, the existing literature on dynamic decision-making is mainly
concerned with action (control) selection with very little or no
consideration to the aspect of information-processing. It should be
emphasized that virtually no significant research has been done by
behavioral decision theorists using real-time decision paradigms.

The literature in the area of sequential probability inference
shows two different approaches to the modeling problem. The first
approach, advanced by statisticians and psychologists, employs Bayes'
rule as a normative representation of how a DM should revise his
probability estimates in light of new information. This approach led to
the study of "conservatism", a suboptimal human behavior that produces
posterior probabilities nearer to the prior probabilities than those
specified by Bayes' rule. The second approach, proposed mainly by
psychologists, argues that the human is a selective, sequential
information-processor with limited capacity and that this leads him to
apply simple heuristics and cognitive strategies. This approach led to the
discovery of such judgemental heuristics as representativeness,
availability, and adjustment and anchoring, which were found to determine
probabilistic inferences in many tasks. However, these findings can only
be described in qualitative terms and, as yet, no quantitative descriptive
theory based on heuristics has emerged.
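
The difference between Bayesian and conservative revision can be sketched
numerically. In the Python example below, the hypotheses, the likelihoods
and the damping exponent c are hypothetical; raising the likelihoods to a
power c < 1 is only one simple way to represent conservatism and is an
assumption of this sketch, not a model taken from the literature reviewed
here.

```python
def bayes_posterior(prior, likelihood):
    """Normative revision: posterior(h) is proportional to prior(h) * P(data | h)."""
    unnorm = {h: prior[h] * likelihood[h] for h in prior}
    z = sum(unnorm.values())
    return {h: v / z for h, v in unnorm.items()}


def conservative_posterior(prior, likelihood, c):
    """Damped revision (assumed form): likelihoods are raised to the power c.

    c = 1 recovers Bayes' rule; 0 < c < 1 keeps the posterior nearer the prior.
    """
    damped = {h: likelihood[h] ** c for h in prior}
    return bayes_posterior(prior, damped)


prior = {"H1": 0.5, "H2": 0.5}
likelihood = {"H1": 0.8, "H2": 0.2}     # P(observation | hypothesis), assumed

bayes = bayes_posterior(prior, likelihood)
cons = conservative_posterior(prior, likelihood, c=0.5)
print(round(bayes["H1"], 3), round(cons["H1"], 3))   # 0.8 0.667
```

The damped posterior for H1 lies between the prior (0.5) and the Bayesian
posterior (0.8), which is the qualitative signature of conservatism.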

The optional stopping problem is related to the information-seeking
ability of the human. In this problem, the DM is provided with an option,
at each stage of the process, to seek (purchase, sample) one more
observation, or to stop and make the terminal decision. Virtually all the
models of optional stopping are normative in construct. They were
developed within the Bayesian framework using the subjectively expected
loss of the sequential decision process as the minimizing criterion of
performance. In model-data comparisons, it was found that all the
relevant procedural variables (e.g., pay-offs, prior probabilities, etc.)
strongly influenced the number of observations, but not as much as the
normative model predicted. It was also found that the optimal expected
loss was quite insensitive to large deviations in the optimal decision
policy ("curse of insensitivity").

The dynamic decision-making problems have not been studied as
extensively as the static or sequential decision-making problems. This
is due, mainly, to their inherent complexity, analytic sophistication
and difficulties in implementing experiments on a computer. Most of the
dynamic decision paradigms considered to date are taken from other fields
such as economics and operations research. Typically, the modeling
approach begins with a normative construct based on dynamic programming,
and then includes human limitations and constraints to produce
normative-descriptive models. A common approach to the derivation of a
normative-descriptive model is to first compare observed behavior with
that prescribed by the normative (truly optimal) model. The discrepancies
are then interpreted either in terms of limitations on the
information-processing capacity or the human's misperception of the task.
The limitation on the information-processing capacity can be linked to
the DM's finite memory, his limited ability to project the effects of his
present decisions into the future, his limited attention span, loss of
decision time, misaggregation of data, etc. The limitation due to
misperception of the task can be handled by postulating non-isomorphic
internal models and differing subjective and objective cost functionals.
The optimal decision policy is obtained under these cognitive and
perceptual constraints, and then compared with the actual behavior.
However, at present there does not exist a systematic method of
identifying the human limitations beyond the current psychological
knowledge. Moreover, the dynamic decision-making models, like those of
optional stopping, are plagued with the "curse of insensitivity", i.e.,
optimal expected loss is insensitive to large deviations in the optimal
decision strategy.

In assessing the potential application of the existing behavioral
decision models to the MTDP, we conclude that none of them addresses the
real-time decision-making issue of the MTDP. However, there exists a
rich experimental literature which can provide insights and ideas into
the nature of human limitations in information-processing and
decision-making contexts. These issues are explored in section 1.6.


1.4 Multi-Task Decision-Making


Sheridan's work on the optimal allocation of personal presence [6]
might be thought of as a preliminary step towards human modeling in a
multi-task context. In this work, Sheridan was concerned with the
dynamic human choice between two alternatives, viz., direct presence by
transporting himself from one location to another, or vicarious presence
via communication. He employed a dynamic programming formulation to obtain
optimal decisions over the planning horizon, with states being the
locations to be considered.

Rouse and Greenstein [7] pose the multi-task decision problem in
terms of event detection and attention allocations. They considered a
multi-task paradigm in which the subjects are presented with the process
histories of several dynamic systems, and are instructed to detect process
failures and react to them as quickly as possible. Rouse and Greenstein
model human event (failure) detection by generating conditional
probabilities of event occurrences, given the observation set, via
discriminant analysis. The attention allocation problem was formulated in
the framework of a single server queueing model with the object of
minimizing the weighted expected waiting time, i.e., unlike the multi-task
decision paradigm of our work, the tasks in Rouse and Greenstein's study
stay in the queue until they are acted upon by the DM. They note the
application of the model to computer-aiding, but the theoretical as well
as experimental results are inconclusive.

Tulga [8] formulated the multi-task decision problem in the
framework of a dynamic, deterministic, single machine-sequencing model. In
Tulga's paradigm, the tasks are represented by rectangles of varying
height (value density) and width (task duration, processing time).
Tasks appear randomly in time and position and move at a constant
velocity towards a deadline. The subject's task is to attend to one task
at a time and thus cause that task's width to collapse uniformly and, one
hopes, to disappear before the task reaches the deadline. The reward
earned is the aggregate reduction in the areas of all tasks. Assuming
stationary task parameters, an open-loop feedback optimal (OLFO) decision
policy was obtained by solving a deterministic optimization problem every
time a new or expected task arrives, and every time a task is completed.
Dynamic programming with branch and bound strategies was employed to
solve the resulting optimization problem.

The studies of Tulga, and of Rouse and Greenstein are particularly
germane to the present research as they exemplify two of the most popular
modeling approaches to the multi-task decision problem (MTDP), viz.,
sequencing (combinatorial) and queueing-theoretic approaches. In section
1.6, we address some of the limitations of these two approaches to the
MTDP and indicate how we have overcome their shortcomings via a
semi-Markov decision process (SMDP) approach.

1.5 Experimental Paradigm

The primary focus of this research effort is on human information-processing
and dynamic decision-making behavior in multi-task situations. In order to
minimize extraneous complexities, such as intricate task structure, resource
constraints, etc., we have considered a simple, yet realistic, computer
controlled experimental set-up shown in Fig. 5. This experimental paradigm is a
modified version of the one used by Tulga [8]. In the experiments, the subjects
observe a CRT screen on which multiple, concomitant tasks are represented by
moving rectangular bars. The bars appear at the left edge of the screen and
move at different velocities to the right, disappearing upon reaching the right
edge. Thus, the screen width represents an "opportunity window". In the present
experimental paradigm, there can be, at most, a total of five tasks on the CRT
screen, with a maximum of one on each line. This number is commensurate with
the results of Miller [9] on the limitations of human information-processing
capacity.

Fig. 5: EXPERIMENTAL APPARATUS (showing the subjects' response box)

The height (reward, value) of each bar is either one, two or three units. The
number of dots ($1 \le m \le 5$) displayed on a bar represents the time (in
seconds) required to process the task. The subject may process a task by
holding down the appropriate push-button as in Fig. 5. By processing a task
successfully, the subject is credited with the corresponding reward ($r = 1$, 2
or 3), and the completed task is eliminated from the screen. However, no
partial credit is given.
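The reward rule above can be illustrated with a minimal sketch. The `Task` class, its field names and the `credit` function are hypothetical names introduced here for illustration; they are not part of the original apparatus software. Only the rule itself (full reward on completion, no partial credit) is taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Task:
    reward: int              # bar height: 1, 2 or 3 units
    time_required: float     # seconds of processing needed (dots on the bar)
    time_worked: float = 0.0  # seconds the push-button has been held so far

    def completed(self) -> bool:
        return self.time_worked >= self.time_required


def credit(task: Task) -> int:
    """Reward earned for a task: all or nothing (no partial credit)."""
    return task.reward if task.completed() else 0


done = Task(reward=3, time_required=2.0, time_worked=2.0)
partial = Task(reward=3, time_required=4.0, time_worked=3.5)
total = credit(done) + credit(partial)  # only the completed task counts
```

Under this rule, time spent on a task that leaves the screen unfinished earns nothing, which is what makes the choice of the next task consequential.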


The above experimental framework retains the essential features of the
multi-task decision problem in a manageable, yet manipulable, context. Using
this formulation, the effects of key task variables on human decision-processes
are studied via the following five experimental conditions:


(i) Condition A: Equal task velocities.

(ii) Condition B: Fixed rewards of 3 units for each task.

(iii) Condition C: Equal processing times of 3 sec. for each task.

(iv) Condition D: Full blown, where none of the variables is fixed.

(v) Condition $B_y$: Similar to condition B, but parallel monitoring is denied.

In condition $B_y$, the images of all the bars, except the one being processed,
are blanked from the CRT screen. This prevents subjects from monitoring other
tasks, and, perhaps, deciding on the next task to be acted upon. Thus, the
subjects are forced to act in a serial mode under this experimental condition.

Six subjects, all University of Connecticut graduate engineering students, were
well-trained on the experimental paradigm. The relationships among the tasks'
velocities and processing times were carefully chosen so as to preclude a
perfect score, and to motivate the subjects to use a rational sequencing
algorithm. In all cases, the subjects were instructed to maximize the
accumulated reward, and were scored using the total score, as well as the
percentage of a perfect score. They were informed of their score following each
90 sec. run and were encouraged to keep it as high as possible.

In the data-taking runs each subject was presented with eight replications of
each experimental condition, in randomized order. This was achieved via a
"scrambling technique" that switched tasks among the five parallel lines for
different runs [10]. The tasks were unscrambled at the time of data analysis.
This type of experimental design, when aggregated across subjects, yields
ensemble statistics that are indicative of the subjects' population. The source
of randomness in this design is the inter-subject variability. This type of
design has the added advantage of minimizing artifacts such as the effects of
learning.

The data collected were time-histories, for each line $i$, of the subject's
decisions, $d_i(t)$; the task completion status, $c_i(t)$; and the error
sequence, $e_i(t)$. The variables $d_i(t)$, $c_i(t)$ and $e_i(t)$ are binary
numbers defined by

$$ d_i(t) = \begin{cases} 1 & \text{if a subject was processing a task on line } i \text{ at time } t \\ 0 & \text{otherwise} \end{cases} \tag{1.5a} $$

$$ c_i(t) = \begin{cases} 1 & \text{if a subject had completed a task on line } i \text{ by time } t \\ 0 & \text{otherwise} \end{cases} \tag{1.5b} $$

and

$$ e_i(t) = \begin{cases} 1 & \text{if a subject was processing a task on line } i \text{ at time } t \text{, which cannot be successfully completed} \\ 0 & \text{otherwise} \end{cases} \tag{1.5c} $$

In Eq. (1.5a), $i = 0$ refers to the "do nothing" or monitoring decision. The
variable $c_i(t)$ is set to zero at the end of the opportunity window of the
present task, before the arrival of the next task in the sequence. At a
sampling rate of 20/sec., each run yielded 1800 data points for each of the
variables recorded. For the same experimental condition, the time-histories
were ensemble averaged to obtain the decision probabilities, $P_{di}(t)$;
completion probabilities, $P_{ci}(t)$; and error probabilities, $P_{ei}(t)$.
The averaging process was first done for each subject, and then across subjects
to obtain the "grand" averages. The details of data analysis are presented in
section 3.1.
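The two-stage averaging just described can be sketched as follows. This is a minimal illustration on synthetic data, not the analysis code of section 3.1; the array layout (subjects, replications, samples) and the name `ensemble_average` are assumptions made for the example.

```python
import numpy as np

RUN_SECONDS = 90
SAMPLE_RATE = 20                        # samples per second
N_SAMPLES = RUN_SECONDS * SAMPLE_RATE   # 1800 samples per run


def ensemble_average(histories: np.ndarray) -> np.ndarray:
    """Average binary time-histories of shape (subjects, replications, samples).

    The mean is taken over replications for each subject first, and then
    across subjects, giving one probability estimate per sampling instant.
    """
    per_subject = histories.mean(axis=1)
    return per_subject.mean(axis=0)


# Synthetic stand-in for d_i(t) on one line: 6 subjects, 8 replications.
rng = np.random.default_rng(0)
d = rng.integers(0, 2, size=(6, 8, N_SAMPLES))
p_d = ensemble_average(d)               # estimate of the decision probability
```

Because every subject contributes the same number of replications here, the two-stage average equals the pooled average; the per-subject stage matters when individual differences are also of interest.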

1.6  SUMMARY 

In previous sections, we have examined the relevant literature on behavioral
decision theory and multi-task decision-making. This overview has suggested
several limitations of the previous work and possible means to overcome them.
The following conclusions and comments in this regard seem appropriate.

(i) Status of behavioral decision theory: Most of the literature on behavioral
decision theory is devoted to single-stage decision-making. The existing
literature on multi-stage decision-making emphasizes either
information-processing (diagnosis) or action selection. However, any realistic
multi-task system involves diagnosis as well as dynamic (usually real-time)
action selection.

(ii) Normative versus descriptive models: Theories of rational behavior may be
normative or descriptive. The normative theory attempts to prescribe how
decisions should be made in the face of a given situation. The descriptive
theory, on the other hand, purports to explain how decisions are made in a
given situation. A review of behavioral decision theory [2] shows that
normative (prescriptive) models can serve as a first approximation to assess
human decision behavior, or they can be used as a formal standard against which
to compare actual performance. The model developed in this thesis is normative
in construct.

(iii) Need for good multi-task paradigm: Experiments in multi-task
decision-making may, by their very nature, become overly elaborate and
cumbersome. This is especially true when the experimenter yields to the natural
temptation to simulate the "entire scenario", thereby possibly masking trends
in the resulting data. In summarizing the research on behavioral decision
theory, we noted that the discrepancies between a normative model and observed
behavior can be attributed to cognitive (intellectual or
information-processing) limitations, misperception of the task and procedural
variables. Since there exists no systematic method of identifying the human
limitations beyond current psychological knowledge, the multi-task supervisory
control decision paradigms should be designed to minimize the limitations due
to misperception of the task and procedural variables. Such an experimental
paradigm was developed in section 1.5. This paradigm is simple, realistic, and
easy to understand and to administer. It retains the essence of the multi-task
decision problem by presenting the human with a dynamic situation wherein tasks
of different value, time requirement and deadline compete for his attention.
Due to its simplicity, the paradigm minimizes the possibility of human
misperception of the tasks. If we can understand and model the behavior of
well-trained subjects in simple laboratory tasks, then perhaps this knowledge
may be extended to more complex tasks. The ability to repeat laboratory
experiments is a powerful tool, for it allows us to study intersubject
differences and the effects of different information, and provides us with a
measure of the variability inherent in the human's decision process.

(iv) Curse of insensitivity: Most normative decision models of behavioral
research are plagued with the "curse of insensitivity": substantial variations
in the optimal decision policies lead to only a small change in the resulting
cost. This problem can be minimized, to some extent, by a proper choice of
reward and processing time structures, such as the discrete format employed in
the present experimental paradigm.

(v) Modeling approaches: Queueing and sequencing (combinatorial) theoretic
approaches [7,8] appear to be the most popular approaches to modeling human
decision strategy in an MTDP. The main shortcoming of the classical queueing
theory approach is that it is extremely difficult, if not impossible, to
determine the structure of an optimal strategy in the MTDP, as it involves a
dynamic, endogenous, preempt-repeat priority discipline with non-conservative†
customer (task) and server (human) characteristics.†† The main advantage of
this approach is that it can handle stochastic arrivals (which are assumed to
occur indefinitely into the future), and stochastic processing times. That is,
the approach can incorporate uncertainty in the task characteristics. However,
in many practical applications the task characteristics are time dependent and
are, to a large extent, predictable. Therefore, it is the randomness associated
with the decision-maker that is of primary importance, and the stochastic
properties of tasks are a second order effect (but not necessarily negligible).
Moreover, the classical queueing theory places great emphasis on finding
stationary measures of system effectiveness, whereas the dominant issue in real
systems is the determination of instant to instant human decision behavior,
while servicing a time dependent demand. The combinatorial approaches, on the
other hand, involve sequencing a finite number of tasks whose arrival times,
processing times and deadlines are known deterministically (if random, mean
values are used). This approach cannot easily handle randomness associated with
the decision-maker or the task parameters. Thus, the incorporation of human
randomness into the decision strategy is difficult using a sequencing theoretic
approach. The control and semi-Markov decision process approach to modeling the
human decision strategy in an MTDP, developed in Chapter II and in [10],
subsumes the earlier two approaches and can explicitly incorporate human
limitations.

† This implies that a customer (task) may leave before being served or the
server (human) may refuse to service a low priority customer (task).

†† With moderate complexity, a stationary, state dependent, non-preemptive
priority policy in non-conservative queueing systems can be determined using
the tools of dynamic programming. The reader is referred to [11] for details.

(vi) Drawbacks of Tulga's model: One major drawback of Tulga's model is that
the fundamental human limitations have not been identified. First, it is almost
impossible for the human to have perfect estimates of the time available and
the time required to process a task. Second, it is well known [12] that humans
do not respond to the same stimulus in identical fashion at different times
(due to their limited resolving power), even when there are no changes in their
information or resources. This makes it difficult to validate/invalidate the
truly normative, sample-path (Monte Carlo) models of the type espoused by
Tulga. Third, it is also well known that the human is a sequential
decision-maker with limited information-processing capabilities [13]. Thus, it
is difficult to justify normative, combinatorial models based on dynamic
programming (DP), as they require the specification of complete future courses
of action before any task is acted upon. Moreover, the computational load of
the DP increases exponentially with the number of tasks to be sequenced. On the
other hand, if a finite stage DP is advocated as a compromise, then the nagging
question is how to choose the number of stages. The last point is not a
peculiarity of Tulga's model alone. It applies to all the behavioral models
employing a DP formulation. The present dynamic decision model (DDM) overcomes
the first two cited limitations of Tulga's model by explicitly including human
randomness in the model, and circumvents the combinatorial problem of DP by
postulating a myopic (one-stage) decision policy.

(vii) Modes of model implementation: If the subjects do not come from a
homogeneous population, in terms of their decision performance, then the sample
path (Monte Carlo) models of the type proposed in [8] make little sense. The
DDM developed in this thesis can be exercised either in a covariance
propagation mode or in a Monte Carlo (sample-path) mode. The first mode gives
probabilistic predictions necessary for model-data validation. This is done in
chapter III. The second mode is appropriate for using the model as a decision
aid. These issues are explored in chapter IV.
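The computational contrast drawn in item (vi) between complete sequencing and a myopic policy can be made concrete with simple counts. The sketch below is illustrative only: the function name and the three measures are assumptions chosen for the example, and the subset count is the standard state count for a dynamic program over subsets of tasks, not a figure taken from [8].

```python
import math


def search_sizes(n_tasks: int) -> dict:
    """Rough problem sizes for sequencing n_tasks tasks on one server."""
    return {
        "orderings": math.factorial(n_tasks),   # exhaustive enumeration
        "subset_dp_states": 2 ** n_tasks,       # DP over subsets of tasks
        "myopic_candidates": n_tasks + 1,       # each task, or do nothing
    }


sizes = search_sizes(5)  # five tasks: the capacity of the display
```

For five tasks the numbers are small in every case, but the first two grow very quickly with the number of tasks, whereas a one-stage policy only compares the currently available alternatives.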


II.  ANALYTIC  MODEL  FOR  HUMAN  TASK  SEQUENCING 


Our analytic approach to modeling human decision-making in multi-task
environments is based on, and will extend, the optimal control model (OCM) of
Kleinman et al. [14-16]. The optimal control model is a general and versatile
methodology for predicting human response in stochastic, multi-variable control
tasks. The modeling approach, rooted in modern control and estimation theories,
is based on the assumption that a well-trained and well-motivated human
operator behaves in an optimal manner, subject to his inherent limitations and
constraints, and the perceived task objectives. The OCM has been applied
successfully in a variety of manual control tasks, as well as in tasks that do
not involve closed-loop control [17-18]. However, all these studies emphasize
either the continuous control function of the human, or his ability to test
binary hypotheses. They do not address the decision-making/task-sharing roles
of the human that gain prominence in supervisory control, or in semi-automated
subsystems of the type discussed in chapter I.

This chapter extends the conceptual framework of the OCM methodology to
multi-task situations in which monitoring, information-processing and dynamic
decision-making (task sequencing) are the operator's main activities. The basic
idea of our modeling approach is to integrate decision-directed elements within
an OCM-like construct. As with the OCM, the approach is normative, in that we
attempt to determine what a well-trained and well-motivated human operator
should do, given the task objectives.

In the sections below, the key elements of the OCM are outlined briefly and the
dynamic decision model (DDM) of human task sequencing performance emerges.

2.1  Optimal  Control  Model  of  Human  Response  -  An  Overview 

2.1.1 Background [14-16]

The basic structure of the OCM is shown in Fig. 6 and consists of the following
elements:

(i) Perceptual model: The perceptual model translates displayed variables,
$y(t)$, into noisy, delayed perceived variables, $y_p(t)$, which form the
information upon which the human bases his subsequent estimation, control
and/or decision strategies.

(ii) Human limitations: The OCM includes time-delay, human randomness, small
signal threshold phenomenon, and scanning effects in its formulation. The
time-delay, $\tau$, accounts for the internal human delays associated with
visual, central processing and neuromotor pathways. Human randomness is assumed
to be manifested as errors in observing/processing displayed quantities and in
executing intended control movements. Thus, observation noise, $v_y(t)$, and
motor noise, $v_u(t)$, are lumped representations of the controller's central
processing and sensory randomness. The non-linear threshold in the OCM captures
the "neglect" phenomenon exhibited by humans when observing small stimuli.
Finally, the scanning-interference model accounts for the fact that the human
must allocate monitoring attention among the various displays [16].

(iii) Information-processor: The information-processor consists of a Kalman
filter and predictor that compensate for the psycho-physical limitations of the
human to generate the "best" estimate of the (augmented) system state,
$\hat{x}(t)$, from the perceived information base.

(iv) Feedback gains: The control task requirements are assumed to be adequately
represented by the minimization of a quadratic cost functional. The operator's
commanded control input is $u_c(t) = -L\hat{x}(t)$, where the feedback gains
$L$ minimize the cost functional.

(v) Motor model: The motor model accounts for the bandwidth limitations of the
human via the neuromotor dynamics, $(\tau_N s + 1)^{-1}$, and his inability to
generate noise-free control signals via the motor noise, $v_u(t)$.

Fig. 6: OPTIMAL CONTROL MODEL OF HUMAN RESPONSE

The Kalman filter-predictor, followed by the feedback gains, represent the
adaptations by which the human operator optimizes his performance and
compensates for his inherent limitations. In general, these model elements
depend on the (human's internal characterization of the) system dynamics, human
limitations, and the task requirements. The Kalman filter generates the best
estimate of the delayed (augmented) state

$$ \hat{p}(t) = E\{x(t-\tau) \mid y_p(\sigma),\ \sigma \le t\} \tag{2.1} $$

according to an equation of the form

$$ \dot{\hat{p}}(t) = A\,\hat{p}(t) + B\,u_c(t-\tau) + G_f\,\nu(t) \tag{2.2} $$

where the filter gains, $G_f$, are determined from a matrix Riccati
differential equation. The quantity

$$ \nu(t) = y_p(t) - C\,\hat{p}(t) \tag{2.3} $$

is the innovation process and represents the difference between the actual and
expected observations. Basically, $\nu(t)$ is the new information that is
brought to the filter by $y_p(t)$. The predictor generates an estimate of the
present state, $\hat{x}(t)$, by projecting $\hat{p}(t)$ ahead by $\tau$ seconds
to compensate for the time-delay.

The state estimate, $\hat{x}(t)$, and its associated covariance matrix,
$\Sigma(t)$, form a sufficient statistic for the closed loop man-machine
system. In other words, the pair $\{\hat{x}(t), \Sigma(t)\}$ can be used as a
basis for determining subsequent control/decision strategies. A second quantity
of interest in the OCM information-processor is the innovation process,
$\nu(t)$, defined in Eq. (2.3). When the internal model of the Kalman filter
adequately represents the controlled element dynamics, the process $\nu(t)$ is
a zero-mean, white Gaussian noise process with covariance $V_y(t)$ equal to the
observation noise covariance. However, when the internal model and system
dynamics are not commensurate, the human's estimate of the system behavior
deviates from the observed dynamic behavior. These differences produce a
non-zero mean, correlated innovation process. This property can be used to
develop models of human failure detection [18], and to investigate the effects
of training on human performance.
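The innovation property can be illustrated numerically. The following is a minimal sketch, assuming a scalar discrete-time system rather than the continuous-time OCM formulation; the system parameters, the constant unmodeled drift used to create a model mismatch, and the function names are all assumptions chosen for the example.

```python
import numpy as np


def innovations(a_true, drift, a_model=0.9, q=0.01, r=0.04, n=5000, seed=0):
    """Run a scalar Kalman filter and return its innovation sequence.

    The true system is x[k+1] = a_true * x[k] + drift + w[k], y[k] = x[k] + v[k].
    The filter's internal model uses a_model and assumes no drift.
    """
    rng = np.random.default_rng(seed)
    x, x_hat, p = 0.0, 0.0, 1.0
    nu = np.empty(n)
    for k in range(n):
        # True system step and noisy measurement.
        x = a_true * x + drift + rng.normal(0.0, np.sqrt(q))
        y = x + rng.normal(0.0, np.sqrt(r))
        # Time update using the internal model.
        x_hat = a_model * x_hat
        p = a_model**2 * p + q
        # Innovation, then measurement update.
        nu[k] = y - x_hat
        gain = p / (p + r)
        x_hat += gain * nu[k]
        p *= 1.0 - gain
    return nu


def lag1_corr(z):
    """Sample lag-one autocorrelation after removing the sample mean."""
    z = z - z.mean()
    return float(np.dot(z[:-1], z[1:]) / np.dot(z, z))


matched = innovations(a_true=0.9, drift=0.0)      # internal model is adequate
mismatched = innovations(a_true=0.9, drift=0.05)  # unmodeled drift in the system
```

With the matched model the innovations have a sample mean and lag-one correlation close to zero; with the unmodeled drift their mean is clearly offset from zero, which is the kind of signature a failure-detection scheme can test for.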

2.1.2 Elements for Decision-making/Detection

A key feature of the OCM's information-processor is that it provides the
statistical characteristics of two important variables: the state estimate,
$\{\hat{x}(t), \Sigma(t)\}$; and the innovation process, $\{\nu(t), V_y(t)\}$.
These, in turn, have provided a mechanism for studying selected
decision/detection phenomena in man-machine systems. For example, Levison and
Tanner [17] studied how well subjects could determine if a signal embedded in
noise exceeded a given threshold. Their model assumed that the operator was an
optimal decision-maker in the sense of maximizing the subjectively expected
utility. For equal penalties on missed detections and false alarms, this rule
reduces to a likelihood ratio test, which was implemented using the sufficient
statistic $\{\hat{x}(t), \Sigma(t)\}$. In another study, Gai and Curry [18]
used the OCM information-processing submodel to analyze failure detection in a
simple laboratory task and in an experiment simulating pilot monitoring of an
automatic landing system. They considered only instrument failures, and modeled
the detection process as a sequential hypothesis test on the mean of the
innovations, $\nu(t)$.
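A sequential test on the mean of the innovations can be sketched with Wald's classical sequential probability ratio test (SPRT). This is an illustrative stand-in and not necessarily the exact test used in [18]; the function name, the hypothesized failure mean `theta1`, and the error rates `alpha` and `beta` are assumptions made for the example.

```python
import math


def sprt_mean_shift(nu, theta1, sigma, alpha=0.05, beta=0.05):
    """Wald SPRT for Gaussian samples: H0 mean 0 versus H1 mean theta1.

    Returns (decision, number of samples used).
    """
    upper = math.log((1.0 - beta) / alpha)   # cross above: declare a failure
    lower = math.log(beta / (1.0 - alpha))   # cross below: declare normal operation
    llr = 0.0
    for k, v in enumerate(nu, start=1):
        # Log-likelihood ratio increment for one Gaussian sample.
        llr += (theta1 / sigma**2) * (v - theta1 / 2.0)
        if llr >= upper:
            return "failure", k
        if llr <= lower:
            return "normal", k
    return "undecided", len(nu)


no_failure = sprt_mean_shift([0.0] * 20, theta1=1.0, sigma=1.0)
failure = sprt_mean_shift([1.0] * 20, theta1=1.0, sigma=1.0)
```

The appeal of a sequential test in this setting is that the number of observations is not fixed in advance: evidence is accumulated only until one of the two thresholds is crossed.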

These studies demonstrate the potential of modern estimation techniques in
decision-making/detection situations. An important feature of the work in
[17-18] is that it provides a validation of the Kalman filter-predictor
submodel in tasks not involving closed-loop control. When these validation
results are combined with the overall verification of the OCM in manual control
tasks, the potential of a control-theoretic construct for modeling human
decision processes emerges.

2.2  Overview  of  Modeling  Approach 

Our approach to modeling human decision behavior parallels the optimal control
model of human response in spirit, but not in form. In the OCM, the control and
information-processing strategies are separable. Once an estimate of the system
state is available, the linear feedback control law uses this estimate as if it
were the true state. Human limitations affect only the quality of (augmented)
state estimates.

This type of separation has been found to be plausible in the present dynamic
decision model (DDM). For any task $i$ in the opportunity window, it is
possible to show that $T_{Ri}(t)$, the time required to complete task $i$
starting at time $t$, and $T_{ai}(t)$, the time available/remaining to work on
task $i$ at time $t$, are valid decision state variables. That is, these two
quantities satisfy the axiomatic definition of a state that it must provide the
complete running summary of past actions (decisions). The joint density of the
decision states of all tasks in the opportunity window is estimated from the
information-processor of the DDM, and provides sufficient information for the
decision-process. The statistics of decision states, along with the task
values, $r_i(t)$, and a performance metric, are used to compute the decision
strategy. By analogy to the control theoretic OCM, the values $r_i(t)$ play the
role of cost functional weights, while the decision state variables correspond
to system state variables.

A block diagram of the DDM is shown in Fig. 7. Each of the N tasks in the
opportunity window is represented by a dynamic subsystem acted on by
disturbances to account for the non-stationarities in task characteristics. The
perceived outputs are delayed, noisy versions of the task states,
$\{x_{Ti}\}$, and are contingent upon the monitoring process. The perceived
outputs are processed to produce the best linear unbiased estimates of the task
states, $\{\hat{x}_{Ti}\}$, and their associated covariances, $\{\Sigma_i\}$,
via a Kalman filter-predictor submodel. The statistics of the task states,
$\{\hat{x}_{Ti}, \Sigma_i\}$, are, in turn, used to determine the first and
second order statistics of the decision states, $\{\hat{T}_{Ri}, \sigma_{Ri}\}$
and $\{\hat{T}_{ai}, \sigma_{ai}\}$. The statistics of the decision states,
along with the task values, $r_i(t)$, are combined to determine the
attractiveness measure, $M_i(t)$, of each task in the opportunity window.
Subsequently, the measures are used to generate the probability $P_{di}(t)$ of
acting on each of the N tasks and the probability $P_{d0}(t)$ of not acting on
any task (or the monitoring probability, $P_{dm}(t)$).

The next few sections expand briefly on various features of the DDM.

2.3  System  Dynamics 

In formulating the multi-task decision problem, it is convenient to
differentiate among the process state or Markov state, $s$; the set of task
states, $x_{Ti}$; and the set of decision states, $x_{di}$. In the present
experimental context, the process state, $s$, is related to the status of the
CRT display and indicates whether or not a task is present on each of the
$K$ (= 5) lines, $K$ being the system capacity. The task state, $x_{Ti}$,
describes the dynamical variables internal to each task $i$. In the present
experimental paradigm, the task state consists of the instantaneous position
and velocity of the bar and the time required to process the task. Finally, the
decision state, $x_{di}$, consisting of the time available and the time
required to process task $i$, is a memoryless functional transformation of the
task state, $x_{Ti}$. These notions are formalized below.

Fig. 7: DYNAMIC DECISION MODEL OF HUMAN TASK SEQUENCING PERFORMANCE

2.3.1 Process state or Markov state, $s$

Since the number of tasks in the system is less than or equal to $K$ (system
capacity), the process state of the multi-task system at an arbitrary time $t$
can be represented by a K-dimensional row vector as

$$ s = (I_1\ I_2\ \ldots\ I_i\ \ldots\ I_K) \tag{2.4} $$

where the binary index variable $I_i(t)$ is given by

$$ I_i(t) = \begin{cases} 1 & \text{if there exists a task on line } i \text{ at time } t \\ 0 & \text{otherwise} \end{cases} $$

The total number of possible process states is $2^K$. Note that the number of
tasks in the opportunity window at any time $t$ is given by

$$ N(t) = \sum_{i=1}^{K} I_i(t) $$

We let $A(t)$ denote the set of N available (accessible) tasks in the
opportunity window at time $t$. Formally,

$$ A(t) = \{\, i : I_i(t) = 1;\ i = 1, 2, \ldots, K \,\} \tag{2.5} $$

Clearly, the decision set $P(t)$, the set of (N+1) feasible decisions at time
$t$, is given by

$$ P(t) = A(t) \cup \{0\} \tag{2.6} $$

Thus, the (N+1) possible decisions at any time $t$ are to attend to one of the
N tasks in the opportunity window, or do nothing. The set of feasible decisions
is time-varying as a consequence of estimation and actions by the DM, and as a
result of the arrival of new tasks with different attributes.
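The bookkeeping of this subsection can be sketched in a few lines. The function names and the example state are hypothetical and introduced only for illustration; lines are numbered from 1, and 0 denotes the "do nothing" decision, as in the text.

```python
from itertools import product

K = 5  # system capacity: number of display lines


def available_tasks(s):
    """A(t): the lines (numbered from 1) that currently hold a task."""
    return {i for i, present in enumerate(s, start=1) if present == 1}


def decision_set(s):
    """Feasible decisions: the available tasks plus 0 for 'do nothing'."""
    return available_tasks(s) | {0}


s = (1, 0, 1, 1, 0)                           # tasks present on lines 1, 3 and 4
n_tasks = sum(s)                              # N(t): sum of the index variables
all_states = list(product((0, 1), repeat=K))  # every possible process state
```

For K = 5 the enumeration contains 32 process states, and the decision set always has one more element than the number of tasks present.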

2.3.2 Task State, $x_{Ti}$

For any task on line $i$, the time required to complete task $i$, $T_{Ri}$; the
position of the bar from the left edge, $\ell_i$; and the velocity, $v_i$, of
the bar constitute the task state variables as shown in Fig. 8. The state
variable $x_{Ti1}(t)$, denoting the time required to complete task $i$ starting
at time $t$, is action oriented. Its evolution can be characterized by the
differential equation

$$ \dot{x}_{Ti1}(t) = \dot{T}_{Ri}(t) = -d_i(t) \tag{2.7} $$

where $d_i(t)$ is a binary decision variable given by

$$ d_i(t) = \begin{cases} 1 & \text{if the decision is to act on task } i \text{ at time } t \\ 0 & \text{otherwise} \end{cases} $$

Since the human can act on only one task at any given time, we have the
following constraints on the decision variables:

$$ d_i(t) = 1 \ \text{implies} \ d_j(t) = 0, \quad j \ne i; \quad i, j \in P(t) $$

where $d_0(t) = 1$ refers to the "do nothing" or monitoring decision.† The
remaining task state variables, representing the position and velocity of the
bar, are given by

$$ \dot{x}_{Ti2}(t) = \dot{\ell}_i(t) = x_{Ti3}(t), \qquad \dot{x}_{Ti3}(t) = \dot{v}_i(t) = w_i(t) \tag{2.8} $$

where $w_i(t)$ is a zero-mean, white Gaussian noise with variance $W_i(t)$ that
accounts for (perceived) non-stationarities in task velocity. In vector-matrix
form, the dynamics of the task state can be represented as

$$ \dot{x}_{Ti}(t) = A\,x_{Ti}(t) + b\,w_i(t) - [1\ 0\ 0]'\,d_i(t); \quad i \in A(t) \tag{2.9} $$

where $x_{Ti} = [x_{Ti1},\ x_{Ti2},\ x_{Ti3}]'$, and $A$ and $b$ follow from
Eqs. (2.7) and (2.8). The subsystem state in Eq. (2.9) is reinitialized to the
new task attributes every time a new task arrives on line $i$.

† Note that the defining differential equation for $T_{Ri}$ assumes a
preempt-resume processing discipline, while the experimental paradigm was
designed to operate in a preempt-repeat mode. The form of Eq. (2.7) was chosen
after examining the experimental data, which showed that the human seldom
preempted a task in all the experimental conditions studied. However, it is
straightforward to include the effects of a preempt-repeat mode of processing
by reinitializing the dynamical equation for $T_{Ri}$ every time $d_i(t)$
switches from 1 to 0 and $T_{Ri}(t)$ is non-zero.
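The task-state dynamics can be simulated with a simple forward-Euler step. This is a minimal sketch: the matrices below are the ones implied by Eqs. (2.7) and (2.8) for the state ordering (time required, position, velocity), while the names `A`, `b`, `e1` and `step`, the step size, and the example numbers are assumptions. The disturbance `w` is treated as a value held constant over the step; a faithful white-noise simulation would need appropriate scaling with the step size, which is omitted here.

```python
import numpy as np

# State ordering: [time required (s), position (in), velocity (in/s)].
A = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0],    # position changes at the current velocity
              [0.0, 0.0, 0.0]])
b = np.array([0.0, 0.0, 1.0])     # disturbance enters the velocity equation
e1 = np.array([1.0, 0.0, 0.0])    # acting on the task reduces the time required


def step(x, d, dt=0.05, w=0.0):
    """One forward-Euler step of x' = A x + b w - e1 d, with d equal to 0 or 1."""
    x_next = x + dt * (A @ x + b * w - e1 * d)
    x_next[0] = max(x_next[0], 0.0)  # the time required cannot go negative
    return x_next


x = np.array([3.0, 0.0, 2.0])  # 3 s of work, bar at the left edge, 2 in/s
for _ in range(40):            # act on the task for 2 s at 20 samples/s
    x = step(x, d=1)
```

After two seconds of processing, one second of work remains and the bar has moved four inches. Setting `d=0` leaves the time required unchanged, which corresponds to the preempt-resume discipline assumed by Eq. (2.7).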


2.3.3 Decision State, $x_{di}$


The  decision  state,  “  [TRi>  ^ai^  is  re*ate<*  to  tasic  state 
via  a  functional  transformation  as 


**!<*>  *  1(^(0) 

In  the  present  experimental  context,  the  time  required  to  complete  task  i 
starting  at  time  t,  Tj^(t) ,  is  given  by 

TRi(t)  *  xdil^  "  *Til^  (2.10) 

The  other  decision  state  variable  Tfl^(t) ,  the  time  available  to  work  on 
task  i  at  time  t,  is  related  to  the  task  state  x^  via 


L  -  x_  (t)  h  -  l  (t) 

T  (t)  sb  v  (t)  *  -  »  ■  m  —— — 

ai'  '  Xdl2{t)  ^13(t)  vt(t) 

where  L  is  the  length  of  the  opportunity  window  (-  12”) « 


(2.11) 


In  the  present  experimental  paradigm,  T  . (t)  is  assumed  to  be  independent 

al 

of  This  is  not  a  restrictive  assumption*  If  the  nature  of  rela¬ 

tionship  between  Tft^(t)  and  TR^(t)  is  known,  it  can  be  incorporated  into 
the  model  formulation  in  a  straightforward  manner* 
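As a small worked illustration of Eqs (2.10) and (2.11), the sketch below maps a task state to the decision state. The function name and the guard against non-positive velocity are assumptions added for illustration; the default window length of 12 follows the value of L quoted in the text.

```python
def decision_state(x, window_length=12.0):
    """Map the task state x = [T_R, position, velocity] to (T_R, T_a).

    T_R is read off directly (Eq 2.10); T_a is the remaining distance in the
    opportunity window divided by the velocity (Eq 2.11).
    """
    t_req, pos, vel = x
    if vel <= 0.0:
        raise ValueError("velocity must be positive for T_a to be defined")
    return t_req, (window_length - pos) / vel
```

For a task that needs 3 sec, sits 4 units into the window and moves at 2 units/sec, `decision_state([3.0, 4.0, 2.0])` gives a time available of 4 sec, so the task can still be completed.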


2.3.4 Perceptual Model

Since the processing times are quantized in steps of 1 sec., the displayed information consists of a modified version of the task state. Thus,

$$\underline{y}_i(t) = \begin{bmatrix} Q[T_{Ri}(t)] \\ \ell_i(t) \\ v_i(t) \end{bmatrix} = \underline{x}_{Ti}(t) + \underline{e}\,v_q(t) \qquad (2.12)$$

where $v_q(t)$, the linearized quantization error, is bounded by

$$-0.5 < v_q(t) < 0.5 \qquad (2.13)$$

In order to represent the effects of quantization on the estimation process, it is frequently assumed [19] that $v_q(t)$ is uncorrelated with $T_{Ri}(t)$, and that it is a stationary, zero-mean, white noise process uniformly distributed over the range of quantization error of Eq (2.13). The autocovariance of the noise process $v_q(t)$ can be shown to be

$$E[v_q(t)\,v_q(\sigma)] = V_q\,\delta(t-\sigma) = \tfrac{1}{12}\,\delta(t-\sigma) \qquad (2.14)$$
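The intensity of the quantization noise can be checked numerically. The sketch below is an illustration only: the function name and the deterministic sweep over one quantization cell are assumptions, chosen so that the result is reproducible. It computes the variance of the rounding error for a 1 sec quantization step.

```python
def quantization_error_variance(samples=100_000):
    """Variance of round(x) - x as x sweeps one quantization cell uniformly."""
    errors = []
    for k in range(samples):
        x = 2.5 + (k + 0.5) / samples  # midpoints of a uniform grid over (2.5, 3.5)
        errors.append(round(x) - x)    # quantization error, inside (-0.5, 0.5)
    mean = sum(errors) / samples
    return sum((e - mean) ** 2 for e in errors) / samples
```

The returned value is close to 1/12, the variance of a uniform distribution of unit width.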


Following usual practice, the human is assumed to perceive a noisy, delayed and linearized replica of $\underline{y}_i(t)$ given by

$$\underline{y}_{pi}(t) = \underline{y}_i(t-\tau) + \underline{v}_{yi}(t-\tau) \qquad (2.15)$$

where

$\tau$ = the human's time delay (≈ .2 sec)

$\underline{v}_{yi}(t)$ = the observation noise at time t

The observation noise $\underline{v}_{yi}(t)$ is a zero-mean, white Gaussian noise process with diagonal covariance matrix $V_{yi}$. As with the OCM, the diagonal elements of the observation noise covariance matrix associated with the task position and velocity are functionally related to the monitoring strategy and the mean-square values of the corresponding output variables according to

Wj/w  'KH  «•“> 

where

$y_{ij}$ = j-th element of the vector $\underline{y}_i$, j = 1, 2, 3

$\rho_{ij}$ = noise to signal ratio (NSR) associated with $y_{ij}$ of task i (≈ .01)

$f_i(t)$ = monitoring allocation to task i

There is assumed to be no intratask attention allocation among the individual components of the displayed variables, as the task information is presented in an integrated form. Thus, $f_i(t)$ is the monitoring attention to task i. Also, since the (linearized) quantization error overshadows the inherent human randomness in perceiving the decision state variable, $T_{Ri}(t)$, the observation noise covariance $V_{yi1}$ can be neglected in comparison to $V_q$. In summary, the time-histories of $\underline{y}_{pi}(t)$ are the stimuli upon which the human bases his subsequent estimation and decision strategies.

2.4  Monitoring  Strategy 

The monitoring allocations, $f_i(t)$, affect the subsequent decision strategy. On the other hand, the specifics of the experimental paradigm determine whether or not the monitoring strategy is dependent on the decision strategy. In the present experimental context, if a task i is acted upon at time t (i.e., $d_i(t) = 1$), it is also monitored. However, there exist two possibilities for the other tasks $j \neq i$:


(i) Parallel monitoring: In this case, all tasks, including the one being acted upon, can be monitored simultaneously. (This corresponds to experimental conditions A-D.) Here, an equal (monitoring) attention allocation strategy, i.e., $f_i(t) = 1/N$, is found to be adequate for model applications. This result is not surprising, since an overview of the existing monitoring models [10,16] indicates that the overall system performance is not very sensitive to changes in the monitoring process over a reasonable range of variation about the optimal strategy, at least for well-designed displays.

(ii) No Parallel monitoring: In this case, tasks other than the one being acted upon are not available for monitoring (experimental condition B^), but monitoring of all tasks is an explicit decision alternative. Here, the monitoring process is strongly coupled to the decision strategy. Noting that $f_i(t)$ is the ensemble probability of monitoring task i at time t, we have by the total probability rule

$$\begin{aligned} f_i(t) &= P\{\text{monitor task } i \text{ at time } t\} \\ &= \sum_{j \in \mathcal{D}(t)} P\{\text{monitor } i,\ \text{act on } j\} \\ &= \sum_{j \in \mathcal{D}(t)} P\{\text{monitor } i \mid \text{act on } j\} \cdot P_{dj}(t) \\ &= P_{di}(t) + \frac{P_{dm}(t)}{N}; \quad i \in A(t) \qquad (2.17) \end{aligned}$$

where it is assumed that the monitoring probability, $P_{dm}(t)$, is equally distributed among N tasks.
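The allocation rule for the no-parallel-monitoring case can be sketched in a few lines. This is a hedged illustration of the stated assumptions only (a task that is acted upon is also monitored, and the probability of the explicit monitoring decision is split equally among the N tasks); the function and argument names are hypothetical.

```python
def monitoring_allocation(p_act, p_monitor):
    """Monitoring allocation f_i for each task when parallel monitoring is not possible.

    p_act     : list of decision probabilities of acting on each of the N tasks
    p_monitor : probability of the explicit monitoring decision
    """
    n = len(p_act)
    if n == 0:
        return []
    # Acting on task i implies monitoring it; the monitoring decision
    # contributes an equal share 1/N to every task.
    return [p + p_monitor / n for p in p_act]
```

With `p_act = [0.5, 0.3]` and `p_monitor = 0.2`, the allocations are 0.6 and 0.4, which sum to one because the decision probabilities do.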
2.5 Information-Processor

The information-processor compensates for the human's inherent randomness, time-delay and monitoring allocations to produce the "best" estimate of the decision state from the perceived information base. As with the OCM, the information-processor consists of a Kalman filter and a linear predictor. This choice was motivated by the results of [17-18], which provided an independent verification of the filter-predictor structure for the information-processor in situations not involving closed-loop control. The Kalman filter-predictor submodel generates the best linear unbiased estimates of the task state, $\hat{\underline{x}}_{Ti}(t)$, and its associated covariance matrix, $E_i(t)$. The pairs $\{\hat{\underline{x}}_{Ti}(t),\ E_i(t)\}$ are subsequently used to compute the first and second order statistics of the decision state, $\underline{x}_{di}(t)$, viz., the pairs $\{\hat{T}_{Ri}(t),\ \sigma_{Ri}(t)\}$ and $\{\hat{T}_{ai}(t),\ \sigma_{ai}(t)\}$ for each task i.

2.5.1  Kalman  Filter 

The Kalman filter generates the best linear unbiased estimate of the delayed state

$$\underline{p}_i(t) = E\{\underline{x}_{Ti}(t-\tau) \mid \underline{y}_{pi}(\sigma);\ \sigma \le t\}$$

according to an equation of the form

$$\dot{\underline{p}}_i(t) = A\,\underline{p}_i(t) - \underline{e}\,d_i(t-\tau) + G_i(t)\,[\underline{y}_{pi}(t) - \underline{p}_i(t)] \qquad (2.18)$$

with the initial condition $\underline{p}_i(t_{0i} + \tau) = [\bar{T}_{Ri}(t_{0i}),\ 0,\ v_i(t_{0i})]'$. Here $t_{0i}$ is the initial (arrival or ready) time of a task on line i, and $\bar{T}_{Ri}(t_{0i})$ is the a priori mean of the processing time.

The filter gains $G_i(t)$ are given by

$$G_i(t) = \Sigma_i(t)\,[V_{yi}(t-\tau) + \underline{e}\,V_q\,\underline{e}']^{-1} \qquad (2.19)$$

where $\Sigma_i(t)$ is generated from the usual Riccati equation

$$\dot{\Sigma}_i = A\,\Sigma_i + \Sigma_i\,A' - \Sigma_i\,[V_{yi}(t-\tau) + \underline{e}\,V_q\,\underline{e}']^{-1}\,\Sigma_i + \underline{b}\,W_i(t-\tau)\,\underline{b}' \qquad (2.20)$$

with  the  initial  condition 


Zi  (t0i  +  T)  “  dia8 


(T  -  T  )‘ 

v  nij  wT  y 

TT 


,  0  ,  .01  TT  Vi(tQi) 


where  T  and  T  are  the  a  priori  maximum  and  minimum  values  of  the 

Rli  RL 

processing time. The initial uncertainty in the velocity estimation is assumed to scale with the square of the velocity in accordance with Weber's law.

2.5.2  Linear  Predictor 

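Prediction of the present task state, $\hat{\underline{x}}_{Ti}(t)$, is obtained by integrating the vector-matrix linear differential equation

$$\dot{\hat{\underline{x}}}_{Ti}(\sigma) = A\,\hat{\underline{x}}_{Ti}(\sigma) - \underline{e}\,d_i(\sigma) \qquad (2.21)$$

from $\sigma = t-\tau$ to $\sigma = t$ with the initial condition $\hat{\underline{x}}_{Ti}(t-\tau) = \underline{p}_i(t)$.

The error covariance associated with the task state estimate $\hat{\underline{x}}_{Ti}(t)$, denoted by $E_i(t)$, is given by

$$E_i(t) = e^{A\tau}\,\Sigma_i(t)\,e^{A'\tau} + \int_{t-\tau}^{t} e^{A(t-\sigma)}\,\underline{b}\,W_i(\sigma)\,\underline{b}'\,e^{A'(t-\sigma)}\,d\sigma \qquad (2.22)$$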
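For the three-state task model, the predictor can be evaluated in closed form. The sketch below is an illustration under stated assumptions, not the report's implementation: it takes A to have a single non-zero entry (position driven by velocity), b = [0, 0, 1]' and e = [1, 0, 0]', as implied by Eqs (2.7) and (2.8), and it holds the decision `d` and the noise intensity `w_var` constant over the delay interval. Under these assumptions A is nilpotent, so the matrix exponential reduces to I + A·tau. All names are hypothetical.

```python
def predict(p, sigma, tau, d=0, w_var=0.0):
    """Propagate the delayed estimate p and covariance sigma forward by tau seconds.

    p     : delayed state estimate [T_R, position, velocity]
    sigma : 3x3 covariance of the delayed estimate (list of lists)
    d     : decision held constant over the delay (assumption)
    w_var : velocity-noise intensity held constant over the delay (assumption)
    """
    phi = [[1.0, 0.0, 0.0],
           [0.0, 1.0, tau],
           [0.0, 0.0, 1.0]]  # exp(A*tau) = I + A*tau for this nilpotent A
    x_hat = [p[0] - d * tau, p[1] + tau * p[2], p[2]]

    # First term of the predictor covariance: phi * sigma * phi'
    tmp = [[sum(phi[i][k] * sigma[k][j] for k in range(3)) for j in range(3)]
           for i in range(3)]
    cov = [[sum(tmp[i][k] * phi[j][k] for k in range(3)) for j in range(3)]
           for i in range(3)]

    # Second term: integral of exp(A s) b b' exp(A' s) over [0, tau], times w_var
    q = [[0.0, 0.0, 0.0],
         [0.0, tau ** 3 / 3.0, tau ** 2 / 2.0],
         [0.0, tau ** 2 / 2.0, tau]]
    cov = [[cov[i][j] + w_var * q[i][j] for j in range(3)] for i in range(3)]
    return x_hat, cov
```

With a delay of 0.2 sec, the velocity uncertainty leaks into the position estimate: a velocity variance of 0.04 adds 0.04 × 0.2² = 0.0016 to the position variance.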


2.5.3 Statistics of the Decision State, $\underline{x}_{di}$

The statistics of the decision state variable $T_{Ri}(t)$ are readily computed from those of the task state, $\underline{x}_{Ti}(t)$, as

$$\hat{T}_{Ri}(t) = \text{conditional mean} = \hat{x}_{Ti1}(t)$$

$$\sigma_{Ri}^2(t) = \text{conditional variance} = E_{i11}(t) \qquad (2.23)$$

The human's perception of the conditional density of the decision state variable $T_{Ri}(t)$ is assumed to be Gaussian with mean $\hat{T}_{Ri}(t)$ and variance $\sigma_{Ri}^2(t)$.

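In order to compute the statistics of the remaining decision state variable $T_{ai}(t)$, we note from Eq (2.11) that it involves the ratio of two Gaussian random variables. If the observation signal-to-noise ratio (SNR) is sufficiently high†, then it can be shown [20] that $T_{ai}(t)$ is approximately a Gaussian random variable. An unbiased estimate of $T_{ai}(t)$ and its variance can be evaluated by linearizing Eq (2.11) about the conditional unbiased estimates $\hat{\ell}_i(t)$ and $\hat{v}_i(t)$ as

$$\begin{aligned} T_{ai}(t) &= \frac{L - \ell_i(t)}{v_i(t)} = \frac{L - \hat{\ell}_i(t) - [\ell_i(t) - \hat{\ell}_i(t)]}{\hat{v}_i(t) + [v_i(t) - \hat{v}_i(t)]} \\ &\approx \frac{L - \hat{\ell}_i(t)}{\hat{v}_i(t)} - \frac{\ell_i(t) - \hat{\ell}_i(t)}{\hat{v}_i(t)} - \frac{[L - \hat{\ell}_i(t)]\,[v_i(t) - \hat{v}_i(t)]}{\hat{v}_i^2(t)} \qquad (2.24) \end{aligned}$$

Using Eq (2.24), we have

$$\hat{T}_{ai}(t) = \text{conditional mean} = \frac{L - \hat{\ell}_i(t)}{\hat{v}_i(t)}$$

$$\sigma_{ai}^2(t) = \text{conditional variance} = \frac{E_{i22}(t) + E_{i33}(t)\,\hat{T}_{ai}^2(t) + 2\,E_{i23}(t)\,\hat{T}_{ai}(t)}{\hat{v}_i^2(t)} \qquad (2.25)$$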
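The linearized statistics of the time available can be computed directly from the position and velocity estimates and the corresponding covariance elements. The following is a minimal sketch of that first-order (delta-method) calculation; the function and argument names are hypothetical, and the default window length of 12 follows the value of L quoted earlier.

```python
import math


def time_available_stats(pos_hat, vel_hat, e22, e33, e23, window_length=12.0):
    """First-order mean and standard deviation of T_a = (L - position) / velocity.

    pos_hat, vel_hat : estimates of position and velocity
    e22, e33, e23    : position variance, velocity variance, and their covariance
    """
    if vel_hat <= 0.0:
        raise ValueError("velocity estimate must be positive")
    t_hat = (window_length - pos_hat) / vel_hat
    variance = (e22 + e33 * t_hat ** 2 + 2.0 * e23 * t_hat) / vel_hat ** 2
    return t_hat, math.sqrt(variance)
```

For `pos_hat = 4`, `vel_hat = 2`, `e22 = 0.0016`, `e33 = 0.04` and `e23 = 0.008`, the estimate is 4 sec with a standard deviation of 0.42 sec; the velocity term dominates the variance because it is multiplied by the square of the time available.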


Due to the scaling nature of the noise processes in the information processor, one might expect that $E_{i22}(t) \approx \epsilon_1\,\hat{v}_i^2(t)\,\hat{T}_{ai}^2(t)$, $E_{i23}(t) \approx \epsilon_2\,\hat{v}_i^2(t)\,\hat{T}_{ai}(t)$ and $E_{i33}(t) \approx \epsilon_3\,\hat{v}_i^2(t)$, where $\epsilon_1, \epsilon_2, \epsilon_3 > 0$. Therefore, Eq (2.25) implies, albeit heuristically, that

$$\sigma_{ai}(t) = \epsilon\,\hat{T}_{ai}(t); \quad \epsilon > 0$$

Thus, the standard deviation of time available, $\sigma_{ai}(t)$, is likely to scale with its conditional mean, $\hat{T}_{ai}(t)$. This is intuitively appealing.

† $\text{SNR} = 10 \log_{10}\,[\hat{v}_i^2 / E_{i33}]$ should be (approximately) greater than 12 dB. This condition is almost always satisfied in man-machine applications.

In summary, the decision state variables, $\underline{x}_{di} = [T_{Ri},\ T_{ai}]'$, of the DDM are assumed to be normal with the non-stationary perceived density and distribution functions, $\gamma_i(T_{Ri}; t)$, $\Gamma_i(T_{Ri}; t)$ and $\phi_i(T_{ai}; t)$, $\Phi_i(T_{ai}; t)$, respectively. That is,

$$T_{Ri}(t) \sim N(\hat{T}_{Ri}(t),\ \sigma_{Ri}^2(t)), \qquad T_{ai}(t) \sim N(\hat{T}_{ai}(t),\ \sigma_{ai}^2(t)) \qquad (2.26)$$

The conditional Gaussian statistics of the decision state form an important input to the decision process as shown in Fig. 7.

2.6 Decision Strategy

In this section, the multi-task decision problem is formulated in the framework of a non-stationary, semi-Markov decision process (SMDP). Via this formulation, the combined statistics of the decision states of N tasks are used to compute the transition probabilities among the various process states for each of the decision alternatives. The transition probabilities, along with the task values, are used to determine the attractiveness measures of tasks, employing the subjectively expected value (SEV) as a criterion of performance. These measures form an input to a stochastic choice model that generates the decision probabilities. The decision process is depicted in Fig. 9, and is elaborated next.


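[Fig. 9 block diagram: the task values $\{r_i\}$ and the decision state statistics $\{\hat{T}_{Ri}, \sigma_{Ri}\}$, $\{\hat{T}_{ai}, \sigma_{ai}\}$ enter the SMDP formulation, which yields the transition probabilities; the SEV criterion converts these into attractiveness measures, which drive the stochastic choice model that outputs the decision probabilities $\{P_{di}\}$.]

Fig. 9: HUMAN DECISION PROCESS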


2.6.1  Semi-Markov  Decision  Process  Formulation 

Recall that a non-stationary SMDP is characterized by the hexad (S(t), $\mathcal{D}(t)$, E(t), T(t), H(t), r(t)), where

S(t) = set of process (Markov) states of the system (state space)

$\mathcal{D}(t)$ = set of possible decisions (action set)

E(t) = set of events (event set)

T(t) = set of transformation rules that describe the changes in the state. This is usually expressed in terms of transition probabilities.

H(t) = holding time function that determines how long the system stays in a given state before making a transition to another specified state. This is expressed in terms of holding time density functions.

r(t) = set of rewards associated with each state transition (reward structure).

Thus, in order to formulate the multi-task decision problem as an SMDP, we need to specify the process descriptors S(t), E(t), T(t), H(t), r(t) and $\mathcal{D}(t)$.

A. State Space, S(t)

The state space S is time invariant and consists of $2^K$ elements corresponding to the $2^K$ possible realizations of the process state $\underline{s}$, where K is the system capacity. Symbolically,

$$S = \{\text{set of } 2^K \text{ process states, } \underline{s}\}$$

Associated with a process state $\underline{s}$ at time t, there exist N pairs of decision state variables $\{T_{Ri}(t),\ T_{ai}(t)\}$, $i \in A(t)$. Here, A(t) is the set of N available tasks in the opportunity window at time t.

B. Event Set, E(t), and the Transformation Rule, T(t)

The transformation rule (or the law of motion), T(t), is expressed in terms of transition probabilities $p^i_{ss'}(t)$, where $\underline{s}$ is the process state at time t, $\underline{s}'$ is the destination process state after a random holding time $\tau$ in process state $\underline{s}$, and $i \in A(t)$ denotes the action on a task i in the opportunity window. The destination state $\underline{s}'$ depends on the values of (N+1) independent random variables, $T_{Ri}(t)$ and $T_{am}(t)$, $m \in A(t)$, associated with the process state $\underline{s}$ and the decision to act on task i at time t. It is clear that a decision to act on task i results in one of the following process state transitions shown in Fig. 10.

(i) Successful completion or loss of task i: The task i is said to be successfully completed if the random variable $T_{Ri}(t)$ of task i is greater than zero, but is less than the available times, $T_{am}(t)$, of all the tasks, including i, in the opportunity window. On the other hand, task i is said to be lost if the random variable $T_{ai}(t)$ is greater than zero, but less than $T_{Ri}(t)$ and $T_{aj}(t)$, $j \neq i$. In any case, the new process state $\underline{s}' = \underline{s} - \underline{e}_i$, where $\underline{e}_i$ is a K-dimensional unit row vector whose i-th component is one and whose other components are zero.

(ii) Loss of a task $j \neq i$: This event occurs if $T_{aj}(t)$ is greater than zero, but less than $T_{Ri}(t)$ and $T_{am}(t)$, $m \neq j$. When this event occurs, the new process state $\underline{s}' = \underline{s} - \underline{e}_j$.

[Fig. 10 diagram: transitions out of process state $\underline{s}$ while acting on task i, including the branch labeled "successful completion or loss of task i", reached at time $t + \tau$.]

Fig. 10: PROCESS STATE TRANSITION DIAGRAM OF THE MTDP

† Note that this formulation assumes complete ignorance of the random variables associated with the future arrivals on the (K-N) empty lines. That is, transitions to process states corresponding to arrivals on empty lines are not included in this formulation. This implies that the decision strategy depends only on the characteristics of tasks in the opportunity window. If the probabilistic information regarding future arrivals is available, it can be incorporated into the decision strategy. The reader is referred to Ref. [10] for details.
Thus, the destination process state, $\underline{s}'$, and the random holding time, $\tau$, depend on the outcome of a race among the (N+1) competing, non-stationary random processes $T_{Ri}(t)$ and $T_{am}(t)$, $m \in A(t)$, associated with the process state $\underline{s}$ and action i. This type of semi-Markov decision process, wherein the state transitions are determined by a race among several random processes, is known as a "competing semi-Markov decision process" [21]. It should be emphasized that the analysis of the MTDP is complicated by the fact that the transition probabilities and the random holding time functions are non-stationary.

It is clear from the above event description that there are N possible process state transitions of interest from state $\underline{s}$. In general, the destination process state $\underline{s}' = \underline{s} - \underline{e}_m$, $m \in A(t)$. In the following, the transition probabilities for the admissible destination process states are computed. We suppress the time dependence of the density and distribution functions of the decision states for ease of notation.

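(a) Probability of event (i): This is the probability that the new process state $\underline{s}' = \underline{s} - \underline{e}_i$, given that the present process state is $\underline{s}$ and the decision is to act on task i. Thus,

$$p^i_{ss'}(t) = \eta_i(t) + \omega_{ii}(t); \quad \underline{s}' = \underline{s} - \underline{e}_i \qquad (2.27)$$

where

$$\begin{aligned} \eta_i(t) &= P\{\text{action on task } i,\ \text{task } i \text{ is successfully completed, other channels intact}\} \\ &= P\{T_{Ri} \le T_{am} \ \text{for all } m \in A(t)\} \\ &= \int_0^\infty \prod_{m \in A(t)} [1 - \Phi_m(\tau)]\ \gamma_i(\tau)\, d\tau \end{aligned}$$

$$\begin{aligned} \omega_{ii}(t) &= P\{\text{action on task } i,\ \text{task } i \text{ is lost, other channels intact}\} \\ &= P\{T_{ai} < T_{Ri},\ T_{ai} \le T_{am} \ \text{for all } m \in A(t),\ m \neq i\} \\ &= \int_0^\infty [1 - \Gamma_i(\tau)] \prod_{\substack{m \in A(t) \\ m \neq i}} [1 - \Phi_m(\tau)]\ \phi_i(\tau)\, d\tau \end{aligned}$$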

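(b) Probabilities associated with event (ii): This is the probability that the new state $\underline{s}' = \underline{s} - \underline{e}_j$; $j \neq i$, given that the present state is $\underline{s}$ and the decision is to act on task i. Therefore,

$$p^i_{ss'}(t) = \omega_{ij}(t); \quad \underline{s}' = \underline{s} - \underline{e}_j; \quad j \neq i \qquad (2.28)$$

where

$$\begin{aligned} \omega_{ij}(t) &= P\{\text{action on task } i,\ \text{an accessible task } j \text{ other than } i \text{ is lost, all the other tasks intact}\} \\ &= P\{T_{aj} < T_{Ri},\ T_{aj} \le T_{am} \ \text{for all } m \in A(t),\ m \neq j\} \\ &= \int_0^\infty [1 - \Gamma_i(\tau)] \prod_{\substack{m \in A(t) \\ m \neq j}} [1 - \Phi_m(\tau)]\ \phi_j(\tau)\, d\tau \end{aligned}$$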

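In summary, the N transition probabilities for each $i \in A(t)$ are

$$p^i_{ss'}(t) = \begin{cases} \eta_i(t) + \omega_{ii}(t); & \underline{s}' = \underline{s} - \underline{e}_i \\ \omega_{ij}(t); & \underline{s}' = \underline{s} - \underline{e}_j; \quad j \neq i \end{cases} \qquad (2.29)$$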
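The race integrals behind these transition probabilities can be evaluated by straightforward numerical quadrature. The sketch below is one possible illustration, assuming Gaussian decision state variables and using a plain midpoint rule rather than the quadrature formulae of the cited references; the function names, the truncation of the integration range at eight standard deviations, and the step count are all assumptions.

```python
import math


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mean, sd):
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def transition_probabilities(t_req, t_avail, steps=20000):
    """Return (eta, omega) for action on the task whose time required is t_req.

    t_req   : (mean, sd) of the time required for the acted-upon task
    t_avail : list of (mean, sd) of the time available for every task in A(t)
    eta     : probability that the acted-upon task is completed first
    omega   : omega[j] is the probability that task j is lost first
    """
    upper = max(m + 8.0 * s for m, s in [t_req] + list(t_avail))
    h = upper / steps
    eta = 0.0
    omega = [0.0] * len(t_avail)
    for k in range(steps):
        tau = (k + 0.5) * h                      # midpoint rule on [0, upper]
        surv_r = 1.0 - normal_cdf(tau, *t_req)   # P{T_R > tau}
        surv_a = [1.0 - normal_cdf(tau, m, s) for m, s in t_avail]  # P{T_am > tau}
        eta += math.prod(surv_a) * normal_pdf(tau, *t_req) * h
        for j, (m, s) in enumerate(t_avail):
            others = math.prod(surv_a[:j] + surv_a[j + 1:])
            omega[j] += surv_r * others * normal_pdf(tau, m, s) * h
    return eta, omega
```

Because exactly one of the N+1 random variables wins the race, `eta + sum(omega)` should be close to one whenever the probability mass below zero is negligible, which gives a convenient sanity check on the quadrature.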


In Ref. [10], numerical quadrature formulae of Hermite [22], and Steen, Byrne and Gelbard [23] were suggested as a means to compute the required transition probabilities. However, the computation of transition probabilities can be greatly simplified using Luce's choice axiom [24-27], which is ideally suited to determine the probability that a certain random variable is the minimum (or maximum) among a set of random variables. This is precisely the problem of interest in generating the transition probabilities. For example, $\eta_i(t)$, the probability that $T_{Ri}(t)$ is less than $T_{am}(t)$; $m \in A(t)$, can be computed via Luce's choice axiom according to

$$\eta_i(t) = \left[ 1 + \sum_{m \in A(t)} \frac{P\{T_{am}(t) - T_{Ri}(t) < 0\}}{P\{T_{Ri}(t) - T_{am}(t) \le 0\}} \right]^{-1} \qquad (2.30a)$$


The main assumption underlying Luce's choice axiom is that the removal of some alternatives (random variables, in our case) does not alter the relative probabilities of choice among the remaining alternatives. In other words, the presence or absence of an alternative is irrelevant to the relative probabilities of choice among the remaining alternatives, although the individual probabilities will generally be affected. The proof of the form of Eq (2.30a) is included in Appendix A.

Since the decision state variables are assumed to be Gaussian, Eq (2.30a) simplifies to

$$\eta_i(t) = \left[ 1 + \sum_{m \in A(t)} \frac{1 + \mathrm{Erf}(\Delta_{im})}{1 - \mathrm{Erf}(\Delta_{im})} \right]^{-1} \qquad (2.30b)$$

where

$$\Delta_{im} = \frac{\hat{T}_{Ri} - \hat{T}_{am}}{\sqrt{2\,(\sigma_{Ri}^2 + \sigma_{am}^2)}} \quad \text{and} \quad \mathrm{Erf}(a) = \frac{2}{\sqrt{\pi}} \int_0^a e^{-t^2}\, dt; \quad \mathrm{Erf}(\infty) = 1$$

Using a well known result [24] that the logistic function is a good approximation to the cumulative normal, Eq (2.30b) can be further simplified as

$$\eta_i(t) = \left[ 1 + \sum_{m \in A(t)} \frac{1 + \exp(\Delta_{im})}{1 + \exp(-\Delta_{im})} \right]^{-1} \qquad (2.30c)$$

The computation of the remaining transition probabilities $\omega_{im}(t)$, $m \in A(t)$, proceeds along similar lines to Eq (2.30).
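The Luce-type approximation for Gaussian decision states needs only pairwise comparisons, so it is cheap to evaluate. The sketch below illustrates the error-function form; it assumes the standardized difference is the difference of means divided by the square root of twice the summed variances, which is what the Gaussian assumption gives for the standard error function. The function name and the handling of numerical underflow are assumptions.

```python
import math


def eta_luce(t_req, t_avail):
    """Approximate probability that the acted-upon task finishes before any task expires.

    t_req   : (mean, sd) of the time required
    t_avail : list of (mean, sd) of the time available for every task in A(t)
    """
    mean_r, sd_r = t_req
    total = 0.0
    for mean_a, sd_a in t_avail:
        delta = (mean_r - mean_a) / math.sqrt(2.0 * (sd_r ** 2 + sd_a ** 2))
        # 1 + Erf(x) = erfc(-x) and 1 - Erf(x) = erfc(x); erfc avoids cancellation.
        denom = math.erfc(delta)
        if denom == 0.0:
            return 0.0  # this task almost surely expires before completion
        total += math.erfc(-delta) / denom
    return 1.0 / (1.0 + total)
```

With a single competing time available the expression is exact, since it reduces to the pairwise probability that the time required is the smaller of the two; with several competitors it is an approximation whose quality rests on the choice axiom.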

C. Holding Time Function, H(t)

The holding time function is specified in terms of holding time density functions, $h^i_{ss'}(\tau)$, which determine how long the system stays in process state $\underline{s}$ before making a transition to a specified state $\underline{s}'$. The density function $h^i_{ss'}(\tau)$ can be obtained by first determining the joint probability - probability density function, $f^i_{ss'}(\tau)$, for the event that a system in process state $\underline{s}$ will make its next transition to process state $\underline{s}'$ after a holding time $\tau$, while acting on a task $i \in A(t)$. This event will occur in the competing SMDP only if the random variable representing the destination process state $\underline{s}'$ takes on the value $\tau$ and all the other N random variables are greater than $\tau$. Therefore,

$$f^i_{ss'}(\tau) = \begin{cases} b_i(\tau) = \prod_{\substack{m \in A(t) \\ m \neq i}} [1 - \Phi_m(\tau)] \cdot \{\gamma_i(\tau)\,[1 - \Phi_i(\tau)] + \phi_i(\tau)\,[1 - \Gamma_i(\tau)]\}; & \underline{s}' = \underline{s} - \underline{e}_i \\[2ex] g_{ij}(\tau) = \phi_j(\tau) \cdot [1 - \Gamma_i(\tau)] \cdot \prod_{\substack{m \in A(t) \\ m \neq j}} [1 - \Phi_m(\tau)]; & \underline{s}' = \underline{s} - \underline{e}_j; \quad j \neq i \end{cases} \qquad (2.31)$$
Note that the transition probabilities, $p^i_{ss'}(t)$, of Eq (2.27) are related to $f^i_{ss'}(\tau)$ via

$$p^i_{ss'}(t) = \int_0^\infty f^i_{ss'}(\tau)\, d\tau \quad \text{for all } \underline{s}' \in S$$

The holding time density functions, $h^i_{ss'}(\tau)$, are given by

$$h^i_{ss'}(\tau) = \frac{f^i_{ss'}(\tau)}{p^i_{ss'}(t)} \quad \text{for all allowable } \underline{s},\ \underline{s}' \qquad (2.32)$$

It should be emphasized that the functions $f^i_{ss'}(\tau)$ and $h^i_{ss'}(\tau)$ are non-stationary. Closed form expressions for the holding time density functions are not possible and, hence, they must be computed numerically [10,22,23].

D.  Reward  Structure,  r(t) 

When the transition from process state $\underline{s}$ to process state $\underline{s}'$ occurs at some time $t + \tau$, the DM earns an expected reward $r^i_{ss'}(\tau)$ in the form of a bonus. That is, the DM earns a lump sum payment at the time of state transition, a payment that depends on the process states $\underline{s}$, $\underline{s}'$; the holding time $\tau$, and action i. In the present MTDP, a reward ("bonus") of $r_i(t)$ units is earned while acting on task i if and only if the new process state $\underline{s}' = \underline{s} - \underline{e}_i$ and the task i is successfully completed. The conditional probability that task i is successfully completed, given that the new process state $\underline{s}' = \underline{s} - \underline{e}_i$ and action on task i, is $\eta_i(t) / [\eta_i(t) + \omega_{ii}(t)]$. In addition, if it is assumed that there is a penalty of $q_m(t)$ units for losing a task $m \in A(t)$, then the reward structure can be described by

$$r^i_{ss'}(\tau) = \begin{cases} \dfrac{r_i(t)\,\eta_i(t) - q_i(t)\,\omega_{ii}(t)}{\eta_i(t) + \omega_{ii}(t)}; & \underline{s}' = \underline{s} - \underline{e}_i \\[2ex] -\,q_m(t); & \underline{s}' = \underline{s} - \underline{e}_m; \quad m \in A(t),\ m \neq i \end{cases} \qquad (2.33)$$
It should be noted that $r^i_{ss'}(\tau)$ is the conditional expected reward, given the holding time. Even though there are no penalties for missed tasks in the present MTDP, $q_m(t)$ of Eq (2.33) could still represent the subjective losses (utilities) assigned by the DM. A logical choice for the subjective values $q_m(t)$ is the objective rewards $r_m(t)$. The reward structure of Eq (2.33) can be generalized to include decision dependent penalties, as well as a continuous yield rate [10].

E. Action Set, $\mathcal{D}(t)$

At any time t, the DM is provided with (N+1) choices: act on one of the N tasks in the accessible set A(t) or not act on any task (i.e., do nothing or monitor). Thus, we have

$$\mathcal{D}(t) = A(t) \cup \{0\}$$

The number of choices may differ from one process state to another. Some process states may have only one alternative and, therefore, choice is constrained whenever such a process state is occupied. The DM's problem is to select the actions (over time) that will make the operation of the system most rewarding.

2.6.2 Attractiveness Measures, $M_i(t)$

The basic assumption underlying the human response modeling is that a well-trained human behaves in a normative, rational manner subject to his inherent limitations. We interpret this, mathematically, in terms of maximizing a specified metric. As with the OCM, the choice of a metric may be either objective (specified by the experimenter), or subjective (adopted by the human in performing and relating to the task). In the present experimental context, the objective metric involves the maximization of reward earned. Since the proposed model is normative in construct, we need to specify a subjective metric. If the subjective metric is the same as the objective metric, then, as shown in [10], a functional equation for the optimal decision strategy can be derived using dynamic programming (DP) and semi-Markov decision process theory. However, the tree-folding back procedure of the DP presents serious computational difficulties ("curse of dimensionality"), and requires the evaluation and specification of all future courses of action before any task is acted upon. The latter point is at variance with the current psychological knowledge of a human's inability to foresee the complete future effects of his present decisions. If a finite stage DP is advocated as a compromise, we are faced with the dilemma of selecting the number of stages. These observations led us to the choice of the subjectively expected value (SEV) of a decision as our metric (or "attractiveness measure") for optimization. It is easy to show [10] that SEV corresponds to a myopic (one-stage) policy, which can be derived from the DP formulation by completely disregarding future rewards. That is, the myopic decision policy acts at every time t as though the present decision was the final one. Conceptually, this approach is similar to the "open-loop-feedback-optimal" approach of control theory, wherein the present value of future information is neglected.†

† The DP formulation of the optimal strategy is of theoretical importance in its own right, as it provides a general and flexible analytic framework for the analysis of dynamic decision-making under uncertainty. This framework covers all cases where the present decisions can affect future information, uncertainties associated with the random processes of the system, future rewards and future actions. More importantly, it was shown in [10] that the optimal decision strategy subsumes Tulga's deterministic, dynamic sequencing formulation of the MTDP [8], as well as the Markov decision problem [21], and several single processor sequencing theoretic rules [28]. The Markov decision formulation was applied and extended in [11] to determine stationary, non-preemptive priority policies in a multi-class queueing system with finite capacity and reneging (i.e., impatient customers).

The attractiveness measure $M_i(t)$ of a decision to act on task i is simply the subjectively expected discounted value of reward from the first transition out of process state $\underline{s}$ regardless of when it occurs. It is given by

$$M_i(t) = \sum_{\text{all } \underline{s}'} p^i_{ss'}(t) \int_0^\infty e^{-\alpha\tau}\, r^i_{ss'}(\tau)\, h^i_{ss'}(\tau)\, d\tau \quad \text{for all } i \in A(t)$$

The use of the discount factor, $\alpha$, in the computation of attractiveness measures, $M_i(t)$, may be interpreted in two ways: First, it can account for the DM's present perception of future rewards. That is, future rewards are worth less at the present time ("reward today is sweeter than reward tomorrow"). Howard [29] calls this explanation for $\alpha$ the "time preference" or the "greed-impatience trade off". The second interpretation is in terms of the uncertainty associated with the duration of the period during which rewards can be earned.

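For the specific MTDP, using Eqs (2.29), (2.32) and (2.33), $M_i(t)$ can be rewritten as

$$M_i(t) = [r_i(t)\,\eta_i(t) - q_i(t)\,\omega_{ii}(t)]\ \beta_i(\alpha; t) - \sum_{\substack{m \in A(t) \\ m \neq i}} q_m(t)\,\omega_{im}(t)\,\zeta_{im}(\alpha; t) \quad \text{for all } i \in A(t) \qquad (2.34)$$

where $\beta_i(\alpha; t)$ and $\zeta_{im}(\alpha; t)$ are the exponential (Laplace) transforms of the holding time density functions $b_i(\tau) / [\eta_i(t) + \omega_{ii}(t)]$ and $g_{im}(\tau) / \omega_{im}(t)$, respectively. They are given by

$$\beta_i(\alpha; t) = \frac{1}{\eta_i(t) + \omega_{ii}(t)} \int_0^\infty e^{-\alpha\tau}\, b_i(\tau)\, d\tau; \quad \beta_i(0; t) = 1$$

and

$$\zeta_{im}(\alpha; t) = \frac{1}{\omega_{im}(t)} \int_0^\infty e^{-\alpha\tau}\, g_{im}(\tau)\, d\tau; \quad \zeta_{im}(0; t) = 1$$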
The attractiveness measure associated with the "do nothing" decision, $M_0(t)$, or that of the monitoring decision, $M_m(t)$, depends on whether or not parallel monitoring is allowed.

(i) Parallel monitoring: When parallel monitoring is allowed, $M_0(t)$ can be interpreted as the human's indifference towards, or perception of, small rewards. In the present context, the "do nothing" decision is made only if none of the available tasks can be completed, or if there are no tasks to be processed. We use

$$M_0(t) = -\sum_{m \in A(t)} q_m(t)\,\omega_{0m}(t)\,\zeta_{0m}(\alpha; t) \qquad (2.35)$$

where $\omega_{0m}(t)$ and $\zeta_{0m}(\alpha; t)$ are computed using a constant "fictitious" processing time for the null task, $T_{R0}$. Thus, $M_0(t)$ represents the loss due to disappearance of all tasks. The value of $T_{R0}$ is chosen to match the data, but is a constant across experimental conditions (A-D).

(ii)  No  Parallel  monitoring :  In  this  case,  monitoring  of  tasks 
other  than  the  one  being  acted  upon  is  not  allowed  (i.e., 
condition  B^) ,  but  monitoring  is  a  separate  valid  decision. 
Here,  we  postulate  that  the  human  makes  this  decision  only 
if  the  enhanced  knowledge  of  the  task  characteristics  off¬ 
sets  any  reward  he  may  have  gained  by  acting  on  one  of  the  N 

tasks.  That  is,  M  (t)  is  the  average  value  of  gathering  in- 
m 

formation  for  6  sec  (inegration  time  step)  starting  at  time 
t ,  and  is  given  by 


M  (t )  - 
m 


y  iM1(t-HS)-M1(t)i 
i  c  A(t) 


1 

N 


(2.36) 
Thus, the monitoring decision is invoked only if the information
value is sufficiently high to preclude action on one
of the tasks in the opportunity window. The attractiveness
measure $M_m(t)$, in conjunction with the measures $M_i(t)$, is
used to compute the monitoring probability, $P_{dm}(t)$.
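The two special-case measures above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes Eq (2.35) to be the negative sum of q_m * omega_0m * zeta_0m over the available tasks, and Eq (2.36) to be the mean absolute change of the task measures over one integration step. The function and argument names are hypothetical, not taken from the report.

```python
def null_attractiveness(q, omega_0, zeta_0=None):
    """Assumed form of Eq (2.35): M_0 = -sum_m q_m * omega_0m * zeta_0m.

    q       -- subjective losses q_m(t) for the available tasks
    omega_0 -- probabilities omega_0m(t) that task m disappears while the
               human "works" on the null task for the fictitious time T_R0
    zeta_0  -- discount terms zeta_0m(alpha; t); all equal to 1 when alpha = 0
    """
    if zeta_0 is None:
        zeta_0 = [1.0] * len(q)  # alpha = 0 case
    return -sum(q_m * w_m * z_m for q_m, w_m, z_m in zip(q, omega_0, zeta_0))


def monitoring_attractiveness(m_now, m_next, n_tasks):
    """Assumed form of Eq (2.36): mean absolute change of M_i over one step.

    m_now, m_next -- M_i(t) and M_i(t + delta) for the tasks in A(t)
    n_tasks       -- N, the number of task lines
    """
    if n_tasks <= 0:
        raise ValueError("n_tasks must be positive")
    return sum(abs(later - now) for now, later in zip(m_now, m_next)) / n_tasks
```

For example, with two available tasks of loss 2 and 3 and disappearance probabilities 0.5 and 0.2, the null measure is -(1.0 + 0.6) = -1.6.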

The form of Eqs. (2.34-2.36) for attractiveness measures is particularly
appealing, as it relates to the "net gain" of each of the task
alternatives available to the decision-maker at time t. The first term
in Eq (2.34) represents the "potential gain" of acting on task i at time
t, whereas the summation term represents the "potential loss" due to the
disappearance of all the other tasks. The criterion explicitly considers
the human's inability to envisage all the future courses of action, as
would be required by a DP formulation. Moreover, Eq (2.34) includes the
human's preference for rewards that are distributed in time via the
discount factor, $\alpha$.

Sensitivity analysis of the DDM (chapter III) has shown that a
value of $\alpha = 0$ gives the best possible match to the data. This could
imply either of two things: First, humans do not discount rewards
distributed over a short time horizon (one to five seconds in our case).
A second and more plausible implication is that the use of a discount
factor in the analysis of dynamic decision-making may be artificial.
That is to say, once the human information-processing limitations are
included and a myopic policy is postulated for the human decision
strategy, it may not be necessary to employ the discount factor, $\alpha$.
In any case, when $\alpha$ is zero, Eq (2.34) simplifies to

$$M_i(t) = r_i(t)\, \eta_i(t) - \sum_{m \in A(t),\, m \neq i} q_m(t)\, \omega_{im}(t) \; ; \quad i \in A(t) \qquad (2.37)$$
Thus, there is no need to numerically evaluate the holding time density
functions. Note that Eq (2.37) is similar to the SEU model of Eq (1.4)
with appropriate interpretation.
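The undiscounted measure can be illustrated with a short Python sketch. It assumes that the first term of Eq (2.37) weights the reward r_i by the probability of finishing task i in time, P{T_Ri < T_ai}, and that the loss term weights each q_m by the probability that task m disappears first, P{T_am < T_Ri}, with all times treated as independent Gaussians. The dictionary keys and function names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_less(mu_x, var_x, mu_y, var_y):
    """P{X < Y} for independent Gaussian X and Y."""
    s = math.sqrt(var_x + var_y)
    if s == 0.0:
        return 1.0 if mu_x < mu_y else 0.0
    return normal_cdf((mu_y - mu_x) / s)


def attractiveness(tasks):
    """Undiscounted (alpha = 0) attractiveness measure for each task.

    Each task is a dict with reward "r", subjective loss "q", and
    (mean, variance) pairs "tr" (time required) and "ta" (time available).
    """
    measures = []
    for i, t_i in enumerate(tasks):
        # Potential gain: reward times probability of completing task i in time.
        gain = t_i["r"] * prob_less(*t_i["tr"], *t_i["ta"])
        # Potential loss: other tasks that vanish while task i is processed.
        loss = sum(
            t_m["q"] * prob_less(*t_m["ta"], *t_i["tr"])
            for m, t_m in enumerate(tasks)
            if m != i
        )
        measures.append(gain - loss)
    return measures
```

When every task has ample time available, the loss terms vanish and each measure reduces to the task's reward, which is the SEU-like behaviour noted in the text.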

In summary, the proposed myopic decision strategy in the general
case, but with $\alpha = 0$, involves the computation of only 2N(N+1)
transition probabilities to evaluate the (N+1) attractiveness measures,
$M_m(t)$ and $M_i(t)$, $i \in A(t)$. The required transition probabilities
may be computed in a straightforward manner via Luce's choice axiom.
Therefore, the computational load of the proposed decision strategy is
insignificant compared to that of the truly optimal DP formulation.

2.6.3  Stochastic  Choice  Model 

A  decision  model  that  selects  the  task  with  maximum  attractiveness 
measure  yields  a  (1-0)  response,  and  suggests  that  the  decision-maker 
would  always  make  the  same  sequence  of  decisions  under  similar  condi¬ 
tions.  However,  it  is  well  known  [12]  that  people  fluctuate  in  their 
response  to  the  same  stimulus,  even  when  there  are  no  changes  in  their 
information  or  resources.  Fluctuations  in  choice  can  arise  because  the 
subject  is  unable  to  discriminate  precisely,  or  because  he  may  make 
calculating,  response  or  perceptual  errors.  The  stochastic  choice  models 
assume that, although the attractiveness measures, $M_i(t)$, could be
characterized by a single fixed number, the subjects perceive each as a
random variable, $\tilde{M}_i(t)$, with some distribution (usually Gaussian).
The randomness may be interpreted in terms of the uncertainties associated
with the human perception of task values, $r_i(t)$. Below, we again invoke
Luce's choice axiom to compute the decision probabilities, $P_{di}(t)$:

(i) Parallel monitoring:

$$P_{di}(t) = \left[ 1 + \sum_{k \in \mathcal{D}(t),\, k \neq i} \frac{P\{\tilde{M}_k(t) - \tilde{M}_i(t) > 0\}}{P\{\tilde{M}_i(t) - \tilde{M}_k(t) > 0\}} \right]^{-1} ; \quad i \in \mathcal{D}(t) \qquad (2.38)$$

(ii) No parallel monitoring:

$$P_{dm}(t) = \left[ 1 + \sum_{k \in A(t)} \frac{P\{\tilde{M}_k(t) - \tilde{M}_m(t) > 0\}}{P\{\tilde{M}_m(t) - \tilde{M}_k(t) > 0\}} \right]^{-1} \qquad (2.39)$$

The decision probabilities are given by a relation similar
to Eq (2.38) with $M_m(t)$ replacing $M_0(t)$.

In Eqs (2.38-2.39), we assume that $\tilde{M}_i(t)$ are Gaussian random
variables with mean $M_i(t)$ and variance $\sigma^2_{M_i}(t)$ that scales
with $M_i(t)$. That is,

$$\sigma_{M_i}(t) = c\, |M_i(t)| \quad (c \approx 0.2\text{-}0.4) \qquad (2.40)$$

where c is the co-efficient of variation. Note that the
forms of Eqs (2.38-2.39) can be employed with any decision
strategy.
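A minimal Python sketch of the stochastic choice step follows. It assumes the ratio form of Luce's choice axiom, P_i = 1 / (1 + sum over k != i of p_ki / p_ik), where p_ab is the probability that the perceived measure of alternative a exceeds that of b, and it further assumes that the perceived measures are independent Gaussians with standard deviation c|M|. The function names and the small guard `eps` are illustrative choices, not part of the report.

```python
import math


def pairwise_pref(m_a, m_b, c):
    """P{perceived M_a > perceived M_b} for independent Gaussians, sd = c*|M|."""
    sd = math.sqrt((c * m_a) ** 2 + (c * m_b) ** 2)
    if sd == 0.0:
        # Both measures are exactly zero (or c = 0): no perceptual noise.
        return 0.5 if m_a == m_b else float(m_a > m_b)
    return 0.5 * (1.0 + math.erf((m_a - m_b) / (sd * math.sqrt(2.0))))


def decision_probabilities(measures, c=0.3, eps=1e-12):
    """Choice probability of each alternative from pairwise preferences."""
    probs = []
    for i, m_i in enumerate(measures):
        ratio_sum = 0.0
        for k, m_k in enumerate(measures):
            if k == i:
                continue
            p_ki = pairwise_pref(m_k, m_i, c)
            p_ik = pairwise_pref(m_i, m_k, c)
            ratio_sum += p_ki / max(p_ik, eps)  # guard against division by zero
        probs.append(1.0 / (1.0 + ratio_sum))
    return probs
```

With equal measures every alternative is chosen with probability 1/n. For more than two alternatives the values produced this way need not sum exactly to one, so a caller may wish to renormalise them.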


2.7 Model Predictions

The dynamic decision model can be used in a straightforward manner
to generate predictions of $P_{di}(t)$, as well as of other response
measures that can be computed from the experimental data:

(i)  The  completion  probability ,  Pci(t)  is  the  probability  that 
task  i  is  completed  by  time  t.  Thus, 

Pcl(t)  =  P(TR1(t)  <  0}  =  ri(0;t)  ;  i  e  A(t)  (2.41) 


When $P_{ci}(t) > 0.99$, the task is assumed to be successfully
completed and, therefore, is removed from the model.
(ii) The error probability, $P_e(t)$, is the probability that the
human commits an error, i.e., starts acting on a task he
cannot possibly complete. Thus, $P_e(t)$ is the sum over all tasks
of the probability of the joint event: action on task i and
the time required to complete task i is greater than the time
available to work on it. Therefore,

$$P_e(t) = \sum_{i \in A(t)} P\{T_{Ri}(t) - T_{ai}(t) > 0\} \cdot P_{di}(t) \qquad (2.42a)$$

Since $T_{Ri}(t)$ and $T_{ai}(t)$ are assumed to be independent and
conditionally Gaussian random variables, Eq (2.42a) becomes

$$P_e(t) = \sum_{i \in A(t)} \left[ \frac{1 - \mathrm{Erf}(\lambda_{ii})}{2} \right] P_{di}(t) \qquad (2.42b)$$

where $\lambda_{ii}$ and $\mathrm{Erf}(\lambda_{ii})$ are defined following Eq (2.30b).
(iii) The average accumulated reward, $\bar{R}(t)$, is the average total
reward earned up to the present time t. It is an overall
response measure, and is given by

$$\bar{R}(t) = \int_0^t \left\{ \sum_{i \in A(\sigma)} r_i(\sigma)\, \frac{dP_{ci}(\sigma)}{d\sigma} \right\} d\sigma \qquad (2.43)$$


(iv) Normalized incremental reward, $W_c(t)$, is the average
instantaneous reward-earning rate, and is a measure of instantaneous
performance. Thus, $W_c(t)$ is the weighted sum of completion
probabilities given by

$$W_c(t) = \frac{1}{K} \sum_{i \in A(t)} r_i(t)\, P_{ci}(t) \qquad (2.44)$$

where K is the system capacity (= 5).
(v) Total expected tasks completed, $\bar{N}_c$, can be computed by
assuming that all tasks $i \in A(t)$ with $P_{ci}(t) > \theta$ ($\approx 0.99$)
are successfully acted upon. Thus,

$$\bar{N}_c = \int_0^T \left\{ \sum_{i \in A(t)} \delta[P_{ci}(t) - \theta] \right\} dt \qquad (2.45)$$

where $\delta[P_{ci}(t) - \theta]$ is the Dirac delta function and T is the
duration of the experiment.

(vi) Average time spent on a task on line i, $T_{si}$, is the time the
human attends to task i on the average. It is given by

$$T_{si} = \int_{t_{0i}}^{t_{fi}} P_{di}(t)\, dt \qquad (2.46)$$

where $t_{0i}$ and $t_{fi}$ are the times between which a task is on
line i.
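Several of the response measures listed above reduce to short computations once the decision and completion probabilities are sampled in time. The Python sketch below is illustrative only. The Gaussian argument `lam` is an assumption standing in for the quantity defined after Eq (2.30b), which is not reproduced in this section; the integral of Eq (2.46) is approximated by the trapezoidal rule; all function names are hypothetical.

```python
import math


def prob_cannot_complete(mu_r, var_r, mu_a, var_a):
    """P{T_R - T_a > 0} for independent Gaussian T_R and T_a.

    Written as (1 - erf(lam)) / 2 with the assumed argument
    lam = (mu_a - mu_r) / sqrt(2 * (var_r + var_a)).
    """
    total_var = var_r + var_a
    if total_var == 0.0:
        return 1.0 if mu_r > mu_a else 0.0
    lam = (mu_a - mu_r) / math.sqrt(2.0 * total_var)
    return 0.5 * (1.0 - math.erf(lam))


def error_probability(fail_probs, decision_probs):
    """Eq (2.42a): sum of P{cannot complete task i} * P_di over tasks."""
    return sum(f * p for f, p in zip(fail_probs, decision_probs))


def normalized_incremental_reward(rewards, completion_probs, capacity=5):
    """Eq (2.44): weighted sum of completion probabilities, divided by K."""
    return sum(r * p for r, p in zip(rewards, completion_probs)) / capacity


def time_on_task(decision_prob_samples, dt):
    """Eq (2.46): trapezoidal integral of P_di(t) over the task's window."""
    s = decision_prob_samples
    return sum(0.5 * (a + b) * dt for a, b in zip(s, s[1:]))
```

As a quick check, a task whose required and available times have the same mean has a 50 percent chance of not being completed, whatever the variances.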

In the next chapter, model predictions of the above response
measures are compared with the experimental results for the conditions
A, B, C, D and $B_y$.

2.8  Summary 

In  this  chapter,  an  analytic  model  of  human  task  sequencing  perfor¬ 
mance  was  developed.  The  modeling  approach  borrowed  from  the  successful 
optimal  control  modeling  methodology.  The  approach  taken  here  and  in 
[10]  is  quite  general,  flexible  and  covers  all  cases  where  the  present 
decisions  affect  future  information  and  future  rewards.  As  with  the  OCM, 
the  dynamic  decision  model  (DDM)  developed  in  this  chapter  consists  of 
two  separable  blocks:  information-processor  and  decision-maker.  The 
information-processor compensates for the human's observation noise,
time-delay and monitoring allocations to produce the best linear unbiased
estimates of the "decision state". The conditional Gaussian statistics
of the decision state constitute a sufficient statistic of the decision
process. The statistics, along with the task values, are used in a
myopic decision policy, based on semi-Markov decision process theory, to
determine the attractiveness measure of each of the decision alternatives.
The measures are subsequently used in a stochastic choice model, which
explicitly considers the human's inability to discriminate precisely, to
generate the decision probabilities.

Some  novel  features  of  our  modeling  approach  are  in  the  use  of  the 
concept  of  a  decision  state;  the  explicit  incorporation  of  human  limita¬ 
tions  at  the  information-processing  and  decision-making  stages;  and  its 
suitability  to  assimilate  new  elements  of  the  task  as  they  become  con¬ 
sidered  and  understood.  The  last  item  corresponds  to  such  issues  as 
precedence  restrictions,  resource  constraints,  general  reward  structures, 
non-stationary task characteristics, and even different experimental
paradigms that involve the basic ingredients of monitoring,
information-processing and dynamic decision-making. Moreover, the model may be used
in  a  covariance  propagation  mode  or  in  a  sample  path  mode.  The  first 
mode  is  appropriate  for  model-data  validation  efforts  presented  in 
chapter  III.  The  second  mode  is  suitable  for  decision-aiding  as  discussed 
in  chapter  IV. 




III.  MODEL-DATA  VALIDATION  STUDIES 


In chapter II, the dynamic decision model (DDM) of human task
sequencing performance was developed, and the model's ability to generate
various response measures of interest, viz., $P_{di}(t)$, $P_{ci}(t)$, $P_e(t)$, etc.,
was  noted.  The  present  chapter  proposes  several  metrics  for  assessing 
the  "goodness  of  fit"  (or  "similarity")  between  the  model  predictions  and 
the  experimental  data,  and  presents  results  on  the  model-data  validation 
efforts. 

3.1 Data Analysis

As mentioned in section 1.5, the data sampled during each run
consisted of the subject's decisions, $d_i(t)$; the task completion status,
$c_i(t)$; and the error sequence, $e_i(t)$. These raw data were ensemble
averaged to obtain empirical estimates of the following response variables:

(i) The decision probability, $P^H_{di}(t)$, of acting on a task of line
i at time t,

$$P^H_{di}(t) = \frac{\sum_{j=1}^{N_s} \sum_{k=1}^{N_{Rj}} d_i^{kj}(t)}{\sum_{j=1}^{N_s} N_{Rj}} \qquad (3.1)$$

where

$N_s$ = total number of subjects
$N_{Rj}$ = total number of runs of subject j

and

$d_i^{kj}(t)$ = 1 if subject j was processing a task on line i
at time t during run k; 0 otherwise.

(ii) The completion probability, $P^H_{ci}(t)$, of having completed a task
on line i by time t,

$$P^H_{ci}(t) = \frac{\sum_{j=1}^{N_s} \sum_{k=1}^{N_{Rj}} c_i^{kj}(t)}{\sum_{j=1}^{N_s} N_{Rj}} \qquad (3.2)$$

where

$c_i^{kj}(t)$ = 1 if subject j has completed task i by time t
during run k; 0 otherwise.

Clearly, $P^H_{ci}(t)$ is a monotonically increasing function of
time. It is reset to zero at the end of the opportunity
window of the present task on line i, i.e., before the arrival
of the next task in the sequence.


(iii) The error probability, $P^H_e(t)$, of engaging a task which cannot
possibly be completed, was calculated from the data via

$$P^H_e(t) = \frac{\sum_{i=1}^{5} \sum_{j=1}^{N_s} \sum_{k=1}^{N_{Rj}} e_i^{kj}(t)}{\sum_{j=1}^{N_s} N_{Rj}} \qquad (3.3)$$

where

$e_i^{kj}(t)$ = 1 if subject j was acting on a task of line i
at time t during run k that cannot be successfully
completed; 0 otherwise.


(iv) The average accumulated reward, $\bar{R}^H(t)$, earned through time t is
related to $P^H_{ci}(t)$ via an expression similar to Eq (2.43).

(v) The normalized incremental reward, $W^H_c(t)$, earned by the human is
given by an equation similar to Eq (2.44).

(vi) The number of expected tasks completed, $\bar{N}^H_c(t)$, was computed
from

$$\bar{N}^H_c(t) = \frac{\sum_{m=1}^{N_s} \sum_{k=1}^{N_{Rm}} \sum_{i=1}^{5} \sum_{j=1}^{N_{Ti}} c_i^{km}(t_{fij})}{\sum_{m=1}^{N_s} N_{Rm}} \qquad (3.4)$$

where $c_i^{km}$ is as defined in Eq (3.2), $N_{Ti}$ is the total number
of tasks that appear on line i, and $t_{fij}$ is the time at which
a task j of the sequence (i.e., j-th pass) on line i reaches
the end of its opportunity window.

(vii) The average time spent on a task on line i that arrived at
time $t_{0ij}$ and (would have) departed at time $t_{fij}$ during the
j-th pass is given by a relation similar to Eq (2.46). That
is,

$$T^H_{sij} = \int_{t_{0ij}}^{t_{fij}} P^H_{di}(t)\, dt \; ; \quad i = 1, 2, \ldots, 5 \; ; \quad j = 1, 2, \ldots, N_{Ti} \qquad (3.5)$$

where $N_{Ti}$ is as defined in Eq (3.4).
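The ensemble averaging used for the empirical probabilities can be sketched as follows. The sketch assumes that each run is stored as a list of 0/1 indicator samples on a common time grid, grouped by subject, and that the average pools all runs of all subjects (the numerator summed over subjects and runs, the denominator equal to the total number of runs). The data layout and function name are hypothetical.

```python
def ensemble_average(indicators):
    """Pooled average of 0/1 indicator time series over subjects and runs.

    indicators[j][k][n] is the indicator (e.g. "subject j was processing
    line i during run k") at time sample n. Subjects may have different
    numbers of runs, but every run is assumed to share the same time grid.
    """
    total_runs = sum(len(runs) for runs in indicators)
    if total_runs == 0:
        raise ValueError("no runs supplied")
    n_samples = len(next(run for runs in indicators for run in runs))
    sums = [0.0] * n_samples
    for runs in indicators:
        for run in runs:
            for n, value in enumerate(run):
                sums[n] += value
    return [s / total_runs for s in sums]
```

For instance, with two runs from one subject and one run from another, each time sample is divided by three, not averaged per subject first.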

3.2  Measures  of  Similarity 

In  order  to  assess  the  closeness  of  model  vs.  data  results  and  to 
perform sensitivity studies on the model, it is necessary to define
"closeness". In this section, we propose several time-history and scalar
measures of similarity, which are subsequently used as a means to validate
the model.

3.2.1  Time-history  Metrics 

These  measures  compare  the  ensemble-averaged  time-history  of  a 
response  variable  obtained  empirically  with  that  predicted  by  the  DDM. 
Here,  we  formulate  five  time-history  metrics  that  appear  to  be  suitable 
in  the  present  multi-task  decision  paradigm. 

(i) The decision probability comparisons, $P^H_{di}(t)$ versus
$P^M_{di}(t)$; i = 0, 1, 2, ..., 5.

(ii) The completion probability comparisons, $P^H_{ci}(t)$ versus
$P^M_{ci}(t)$; i = 1, 2, ..., 5.

(iii) The normalized incremental reward comparisons, $W^H_c(t)$ versus
$W^M_c(t)$. Equivalently, the difference $(W^H_c(t) - W^M_c(t))$, or the
rms difference $W_{cr}(t)$ given by

$$W_{cr}(t) = \left[ \sum_{i=1}^{5} \left[ r_i(t) \left( P^H_{ci}(t) - P^M_{ci}(t) \right) \right]^2 \right]^{1/2} \qquad (3.6)$$

may be used as a measure of similarity.

(iv) The accumulated reward comparisons, $\bar{R}^H(t)$ versus $\bar{R}^M(t)$.

(v) The error probability comparisons, $P^H_e(t)$ versus $P^M_e(t)$.

3.2.2 Scalar Metrics

Below, we propose six scalar metrics that appear to be pertinent
in the multi-task paradigm. The suggested scalar measures are useful in
the model-data validation studies, as well as in understanding the impact
of changes in various model parameters on the DDM predictions.

(i) Action metric, AM, computes the normalized time integral of the
squared error differences between the decision probabilities
$P^H_{di}(t)$ and $P^M_{di}(t)$. That is,


(3.7) 


where T is the duration of the experiment. The square root
of AM is a measure of the average discrepancy between $P^H_{di}(t)$
and $P^M_{di}(t)$.

(ii) Incremental reward metric, IRM, is the normalized time integral
of the squared, weighted difference of the completion
probabilities $P^H_{ci}(t)$ and $P^M_{ci}(t)$ given by

$$\mathrm{IRM} = \frac{\sum_{i=1}^{5} \int_0^T \left[ r_i(t) \left( P^H_{ci}(t) - P^M_{ci}(t) \right) \right]^2 dt}{\sum_{i=1}^{5} \int_0^T r_i^2(t)\, dt} \qquad (3.8)$$

The square root of IRM is a measure of the difference between
the average reward-earning rates of the human and the model.

(iii) Accumulated reward metric, ARM, is the normalized time integral
of the squared difference between the average reward earned
up to that instant of time by the human and the model.
Therefore,
where is the maximum available reward during the run.

The square root of ARM is a measure of discrepancy between
the average overall performance of the model and the human.

(iv) Task completion metric, TCM, computes the normalized squared
difference between the average number of tasks completed by
the human and the model as


$$\mathrm{TCM} = \left[ \frac{\bar{N}^M_c - \bar{N}^H_c}{N_{avt}} \right]^2 \qquad (3.10)$$

where $N_{avt}$ is the total number of available tasks during the
experimental run.

(v) Average time on each task metric, ATTM, calculates the
normalized root-mean-squared sum of the difference between the times
spent on each task by the human and the model according to

(3.11)

where $N_{Ti}$ is defined in Eq (3.4) and the remaining symbol in
Eq (3.11) is the initial (actual) processing time of a task on line i
during the j-th pass.

(vi) Error probability metric, EPM, is the normalized time integral
of the squared differences between the error probabilities
$P^H_e(t)$ and $P^M_e(t)$ and is given by

$$\mathrm{EPM} = \frac{1}{T} \int_0^T \left[ P^H_e(t) - P^M_e(t) \right]^2 dt \qquad (3.12)$$

Note that the normalized scalar measures can range from a
value of 0, corresponding to a perfect fit between the model
and data, to a maximum value of 1.
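Two of the scalar metrics can be computed directly from sampled time histories, as in the Python sketch below. It assumes uniformly sampled histories with step `dt`, approximates the integrals by rectangular sums, takes IRM as the ratio described in the text (integrated squared weighted difference over integrated squared reward), and takes EPM as the time-averaged squared difference of the error probabilities. The function names and data layout are hypothetical.

```python
def irm(rewards, pc_human, pc_model, dt):
    """Incremental reward metric from sampled histories.

    rewards[i][n], pc_human[i][n], pc_model[i][n]: value of task line i
    and its completion probabilities at time sample n.
    """
    num = 0.0
    den = 0.0
    for r_i, h_i, m_i in zip(rewards, pc_human, pc_model):
        for r, h, m in zip(r_i, h_i, m_i):
            num += (r * (h - m)) ** 2 * dt
            den += r ** 2 * dt
    if den == 0.0:
        raise ValueError("rewards are identically zero")
    return num / den


def epm(pe_human, pe_model, dt):
    """Error probability metric: time-averaged squared difference."""
    duration = dt * len(pe_human)
    if duration == 0.0:
        raise ValueError("empty time history")
    return sum((h - m) ** 2 for h, m in zip(pe_human, pe_model)) * dt / duration
```

Both functions return 0 when the model reproduces the data exactly and 1 when the two probability histories differ by one at every sample, matching the stated range of the normalized measures.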
3.3 Model vs. Data Comparisons

The application of the DDM to generate predictions of various
response measures is straightforward, once we specify the parameter set
$\Omega = \{\tau, \rho, c, T_{R0}\}$. From experience with the OCM, we choose

$\tau$ = human's time-delay = 0.2 sec
$\rho$ = observation noise-to-signal ratio = 0.01 (i.e., -20 dB)

After a sensitivity study was made on the DDM, we selected the remaining
parameters as

c = co-efficient of variation = 0.3 (see Eq. (2.40))
$T_{R0}$ = "fictitious" processing time = 3 sec (see Eq. (2.35))

The parameter set was held constant across experimental conditions. In
all cases, the subjective values $q_i$ are chosen to be the objective
rewards $r_i$. Pertinent data on task attributes, viz., arrival times,
processing times, values and velocities, for the experimental conditions
A, B, C, D and $B_y$ may be found in Ref. [10].
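The nominal parameter set can be captured in a small configuration object, which also makes the decibel conversion of the noise ratio explicit (10 log10 of 0.01 is -20 dB). This is a sketch only; the class and field names are hypothetical, and the default values are the ones quoted above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DDMParameters:
    """Nominal DDM parameter set; field names are illustrative."""

    time_delay: float = 0.2            # human's time-delay, sec
    noise_ratio: float = 0.01          # observation noise-to-signal ratio
    coeff_variation: float = 0.3       # c in Eq (2.40)
    null_processing_time: float = 3.0  # "fictitious" processing time, sec

    def noise_ratio_db(self):
        """Noise-to-signal ratio expressed in decibels."""
        return 10.0 * math.log10(self.noise_ratio)
```

Keeping the set immutable mirrors the statement that the same values were used across all experimental conditions.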

The five time-history metrics generated from the data and the model
are compared in Figs. (11-35) for the five experimental conditions A, B,
C, D and $B_y$. The ensemble data were obtained by averaging over $N_R$ runs
(e.g., $N_R$ = 48 for condition A). The results show striking similarity
between the data and model predictions. The model-data match is uniformly
good to excellent for all the five experimental conditions studied. This
is most noteworthy considering that a nominal set of parameters was used
throughout, and that the decision problem involved is complex. To be
sure, there are some discrepancies, as in the decision probability ($P_{di}$)
comparisons: they show that the model predictions exhibit rapid
variations when compared to the data. This discrepancy is likely a result of
human inertias, e.g. neuro-muscular lags, decision time losses, etc. It
can be corrected by employing subjective values that depend on previous
actions, or by incorporating a switching cost in the attractiveness
measure of Eq (2.37). Since the discrepancies were not major in terms of
the overall performance comparisons $\bar{R}^M(t)$ vs. $\bar{R}^H(t)$, and since our focus
was on developing the structure of the human decision model rather than the
fine-tuning of it, these modifications were not explored in detail.

The average times spent on each task by the model and the human,
along with the six scalar measures of similarity for experimental
conditions A, B, C, D and $B_y$, are displayed in Tables 1 through 5. They also
indicate a reasonably close agreement between the model and data results.


Line \ Pass      1      2      3      4      5      6

Line 1:  2.659 (2.619)   1.603 (1.464)   2.767 (2.607)   0.361 (0.321)   3.233 (4.155)
Line 2:  5.510 (5.333)   4.132 (4.167)   3.619 (3.976)   3.763 (3.500)   1.922 (1.690)   2.454 (2.417)
1.763 (1.583)   1.603 (1.631)
(3.250)
(2.643)
(3.417)
1.763 (1.536)
(3.607)   3.531 (3.500)   4.645 (4.440)   2.769 (2.583)   1.582 (1.631)

TABLE 1a: AVERAGE TIME SPENT ON EACH TASK IN EACH PASS FOR
CONDITION A (Brackets: Data)


SCALAR MEASURE    VALUE
AM                0.05569
IRM               0.05421
ARM               0.00018
TCM               0.00019
ATTM              0.0431
EPM               0.05867

TABLE 1b: SCALAR MEASURES OF SIMILARITY FOR CONDITION A
(A Value of 0 Corresponds to a Perfect Fit)
1.762 (1.615)
(4.385)
1.585 (3.385)   3.473 (3.104)   1.427
(3.573)   3.379
1.364 (1.740)   3.542 (2.906)   1.376 (1.677)   2.409 (2.573)   2.376 (2.510)   1.376 (0.917)   ( - )

TABLE 2a: AVERAGE TIME SPENT ON EACH TASK IN EACH PASS FOR CONDITION B
(Brackets: Data)
SCALAR MEASURE    VALUE
AM                0.06656
IRM               0.09250
ARM               0.00019
TCM               0.028xl06
ATTM              0.06706
EPM               0.00933

TABLE 2b: SCALAR MEASURES OF SIMILARITY FOR CONDITION B
(0.405)   3.665 (3.270)   3.613 (3.608)   0.040 (0.514)   1.306
0.166 (0.716)   3.738 (3.608)   3.288 (3.527)   3.672 (3.541)   0.491 (1.338)   0.030 (0.230)
(1.149)   ( - )
0.023 (0.284)

TABLE 3a: AVERAGE TIME SPENT ON EACH TASK IN EACH PASS FOR CONDITION C
(Brackets: Data)


SCALAR MEASURE    VALUE
AM                0.06676
IRM               0.09269
ARM               0.00045
TCM               0.00370
ATTM              0.07856
EPM               0.00200

TABLE 3b: SCALAR MEASURES OF SIMILARITY FOR CONDITION C
(1.688)   2.783 (2.688)   3.407
0.392

TABLE 5a: AVERAGE TIME SPENT ON EACH TASK IN EACH PASS FOR CONDITION $B_y$
(Brackets: Data)


SCALAR MEASURE    VALUE
AM                0.06105
IRM               0.06944
ARM               0.00066
TCM               0.00094
ATTM              0.07897
EPM               0.00132

TABLE 5b: SCALAR MEASURES OF SIMILARITY FOR CONDITION $B_y$
3.4 Sensitivity Analysis of the DDM

Sensitivity studies were made on the DDM with respect to the
parameter set $\Omega$. The study showed that the model predictions exhibit greater
sensitivity to the parameter c, the co-efficient of variation, than to
the remaining parameters $\tau$, $\rho$, $T_{R0}$. Therefore, only the results of
varying the parameter c are presented in detail for experimental
conditions B and D, and results for the other parameters and the discount
factor, $\alpha$, are briefly summarized.

(i) Variations of co-efficient of variation, c: The parameter
c was varied in the range 0.1 - 1.0 and the model predictions
of percent reward earned, percent tasks completed and
the scalar measures of similarity are plotted in Figs. 36
and 37 for the experimental conditions D and B, respectively.
(The measure TCM is not shown, as it is similar to comparing
percent tasks completed by the human and the model.)
As the value of c increases, the percent reward earned and
the percent tasks completed by the model decrease. This
is because the model allocates attention equally among tasks
at high values of c. This results in a reduction in the
number of tasks being completed and, hence, the reward, since
the value is credited only at the end of a successful task
completion. The tendency of the model to uniformly allocate
attention among tasks at large values of c causes a decrease
in the measure AM. However, all the other measures of
similarity, IRM, ARM, ATTM and EPM, generally increase with
increasing c. Note, in particular, that ARM, which is a
measure of overall performance of the model, exhibits good
sensitivity to c when compared to IRM, which is a measure of
incremental performance. Overall, the results indicate that
a value of c in the range 0.3 ± 0.1 gives a good fit to the
experimental data.

FIGURE 36: SCALAR MEASURES VERSUS CO-EFFICIENT OF VARIATION FOR EXPERIMENTAL CONDITION D

FIGURE 37: SCALAR MEASURES VERSUS CO-EFFICIENT OF VARIATION FOR EXPERIMENTAL CONDITION B

(ii) Variations of time-delay, $\tau$: As time-delay increases, the
uncertainty associated with the estimation of the decision
state increases. This, in turn, leads to a smaller number
of tasks being completed, and smaller reward being earned.
The measures AM and IRM were found to be relatively
insensitive (within 10 percent) to time-delay variations in the
range 0.15 - 0.50 sec, whereas ARM was quite sensitive to $\tau$.
Also, the measures ATTM and EPM exhibited modest increases
with time-delay. The overall results indicated that a value
of $\tau$ in the range 0.20 ± 0.05 sec is the best choice, a
range consistent with that employed in the OCM.

(iii) Variations of discount factor, $\alpha$: As $\alpha$ increases, the model
allocates attention to tasks with small processing times.
This results in a decrease of total reward earned, although
the number of tasks (of less value) completed may increase.
The measures AM, IRM and ATTM generally increase with $\alpha$,
whereas the overall measure ARM is insensitive (within 10
percent) to variations in the discount factor. Overall, a
value of $\alpha = 0$ was found to give the best possible match to
the data. Therefore, the parameter $\alpha$ was discarded from
the model.
(iv) Variations of observation noise ratio, $\rho$: The model
response was relatively insensitive (within 10%) to
observation noise ratio in the range -15 dB to -25 dB. However,
the results showed some perplexing trends. At high values
of $\rho$ (i.e., less negative), the model earned more reward
and completed more tasks than at low values of $\rho$.
Therefore, the measures IRM and ARM decrease with increases
in $\rho$, but the measure AM appears to increase slightly. This
apparent anomaly may be due to complex interaction between $\rho$
and the co-efficient of variation, c.

(v) Variations of "fictitious" processing time, $T_{R0}$: As $T_{R0}$
increases, the attractiveness measure, $M_0(t)$, becomes more
negative. This reduces the "do nothing" probability, $P_{d0}(t)$,
and results in a non-decreasing total reward. A value of
$T_{R0}$ = 3 to 5 sec was found to be a reasonable choice in the
present experimental context.
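The effect of the co-efficient of variation described in item (i) can be illustrated with a two-task example in Python. The sketch assumes that the perceived measures are independent Gaussians with standard deviation c|M|, so that the probability of preferring the better task is a normal tail probability; as c grows, this probability drifts towards 0.5, i.e. towards equal allocation of attention. The function name and the sample values are hypothetical.

```python
import math


def prefer_first(m1, m2, c):
    """P{perceived M1 > perceived M2}, independent Gaussians with sd = c*|M|."""
    sd = c * math.sqrt(m1 ** 2 + m2 ** 2)
    if sd == 0.0:
        return 0.5 if m1 == m2 else float(m1 > m2)
    return 0.5 * (1.0 + math.erf((m1 - m2) / (sd * math.sqrt(2.0))))


def sweep(m1, m2, c_values):
    """Preference probability for each value of c in c_values."""
    return [prefer_first(m1, m2, c) for c in c_values]
```

Calling `sweep(3.0, 2.0, [0.1, 0.3, 0.5, 1.0])` gives a strictly decreasing sequence that stays above 0.5, which is consistent with the reported loss of reward at large c.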

The above sensitivity results show that the choice of the parameter
set $\Omega$ is not critical, at least within a reasonable range of variation.
However,  future  research  could  determine  whether  or  not  the  parameter  set 
remains  constant  with  modified  decision  paradigms,  such  as  those  suggested 
in  section  4.1. 

3.5 Comparison with Other Decision Models

Since  the  decision  situation  basically  involves  dynamic  sequencing 
of  tasks  under  uncertainty,  a  logical  question  is:  "Couldn't  we  have 
used  one  of  the  many  sequencing  rules  that  appear  in  the  scheduling 
literature  [28]  to  model  human  decision  strategy  as  effectively  as  the 
DDM?"  In  this  section,  we  answer  this  question  in  the  negative  by 
comparing  the  DDM  with  four  heuristic  sequencing  rules  of  scheduling 
theory. We also contrast the DDM with two other decision rules, which may
be interpreted as special cases. The results illustrated here are for
condition D only, but they are representative of the other conditions as
well.

3.5.1  Comparison  with  Heuristic  Sequencing  Rules 

The  following  four  decision  rules  were  selected  for  comparison 
with  the  DDM: 

(i) Weighted shortest remaining processing time (WSRPT) rule:
At any time t, this rule chooses a task with maximum
$[r_i(t)/T_{Ri}(t)]$. Some advantages of the WSRPT rule are: (a)
it minimizes the weighted completion times as well as the
weighted waiting times of tasks being sequenced, and (b)
it does not require any look-ahead features, even though
tasks become available intermittently, i.e., it is a
dispatching decision rule. The major drawbacks of this
rule are: (a) it stipulates a (1,0) type of decision
and does not consider randomness in human response; (b)
it does not take into account the time available to work
on a task, although it does minimize average lateness of
tasks (if allowed to work even after the deadline has
passed); (c) it assumes that $T_{Ri}(t)$ is deterministic; and
(d) it discriminates among tasks to the greatest possible
extent, resulting in increasingly excessive waiting time
for low priority tasks. The first two cited limitations
of WSRPT are removed in the decision rules (ii) and (iii),
respectively.

(ii)  WSRPT  with  stochastic  choice :  This  rule  is  similar  to 
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(i),  except  that  it  employs  Luce's  choice  axiom  to  render 
the  decision  rule  random  as  in  the  DDM. 

(iii)  Modified  WSRPT:  At  any  time  t,  this  rule  selects  a  task 
with  maximum  [r  (t)  | TRi (t)  ] •  u[Tai(t)  -  1^(0],  where  u(’) 
is  a  unit  step  function.  This  rule  is  similar  to  (i),  but 
takes  into  consideration  the  time  available  to  work  on  a 
task  via  a  unit  step  function  involving  slack  time, 

«.!<*>  -  TRl<t))' 

(iv) Weighted slack time (WST) rule: At any time t, this rule
selects a task with maximum $[r_i(t)/(T_{ai}(t) - T_{Ri}(t))]$. This
scheme is often used with WSRPT sequencing to overcome limitation (d)
of the WSRPT rule.
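
The four rules can be compared side by side on a small example. The following is a minimal sketch, not the implementation used in this study: the task set, the field names (`r`, `T_R`, `T_a`), the convention u(0) = 1 for the unit step, and the treatment of non-positive slack in the WST rule are all illustrative assumptions.

```python
import random

# Hypothetical task set; the field names are illustrative assumptions:
#   r   : task value r_i(t)
#   T_R : time required to complete the task, T_Ri(t)
#   T_a : time available before the deadline, T_ai(t)
TASKS = {
    "A": {"r": 4.0, "T_R": 2.0, "T_a": 5.0},
    "B": {"r": 3.0, "T_R": 1.0, "T_a": 1.5},
    "C": {"r": 8.0, "T_R": 2.0, "T_a": 1.0},  # cannot finish before its deadline
}


def wsrpt_index(task):
    """Rule (i): value per unit of remaining processing time."""
    return task["r"] / task["T_R"]


def modified_wsrpt_index(task):
    """Rule (iii): WSRPT index gated by a unit step on the slack time."""
    slack = task["T_a"] - task["T_R"]
    # Assumption: the unit step is taken as u(0) = 1.
    return wsrpt_index(task) if slack >= 0 else 0.0


def wst_index(task):
    """Rule (iv): value per unit of slack time."""
    slack = task["T_a"] - task["T_R"]
    # Assumption: tasks without positive slack receive index 0.
    return task["r"] / slack if slack > 0 else 0.0


def choose_max(tasks, index):
    """Deterministic (1,0) decision: pick the task with the largest index."""
    return max(tasks, key=lambda name: index(tasks[name]))


def luce_probabilities(tasks, index):
    """Rule (ii): choice probabilities proportional to the index values."""
    scores = {name: index(task) for name, task in tasks.items()}
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}


def choose_stochastic(tasks, index, rng):
    """Draw one task according to the Luce choice probabilities."""
    probs = luce_probabilities(tasks, index)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]
```

With these hypothetical numbers, pure WSRPT selects task C even though C can no longer be completed before its deadline, whereas the modified WSRPT and WST rules both select task B. Calling `choose_stochastic(TASKS, wsrpt_index, random.Random(0))` draws a task at random with probabilities 2/9, 3/9 and 4/9 for A, B and C.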

Table 6 compares DDM performance with that of the heuristic
sequencing rules (i)-(iv) via the scalar measures of similarity for
experimental condition D. The figures in brackets display the percent
decrement in performance of a heuristic sequencing rule, using the
measures for the DDM as a base. The results clearly indicate that the
DDM performs significantly better than sequencing rules (i)-(iv). It
should also be noted that the WSRPT rule with stochastic choice does
better than pure WSRPT, thereby confirming randomness in human
decision behavior, as well as the inadequacy of Monte Carlo models of
the type espoused by Tulga [8]. These results also cast doubt on
models that assume perfect human perception of the task attributes.

3.5.2  Comparison  with  Related  Decision  Models 


Several  decision  models  that  are  derivatives  of  the  DDM  were 
studied;  two  particularly  interesting  ones  are  discussed  here. 
(i) Related model 1: This model assumes that the subjective losses,
$q_i(t)$, are zero. Thus, the attractiveness measures of Eqs. (2.35)
and (2.37) become, respectively,

$M_0(t) = 0$

$M_i(t) = r_i(t)\,\pi_i(t); \quad i \in A(t)$      (3.13)


(ii) Related model 2: In this model, the attractiveness measures
are given by

$M_0(t) = -\sum_{j \in A(t)} r_j(t)\, P\{T_{aj}(t) \le T_{a0}\}$

$M_i(t) = r_i(t)\, P\{T_{Ri}(t) < T_{ai}(t)\} - \sum_{j \in A(t)} r_j(t)\, P\{T_{aj}(t) < T_{Ri}(t)\}$      (3.14)


This model may be derived from Eqs. (2.35) and (2.37) by letting all
the available times $T_{am}(t) = \infty$, $m \ne j$, while computing
$\omega_{ij}$, and setting $T_{aj} = \infty$, $i \ne j$, in
evaluating $\pi_i(t)$. The form of the attractiveness measures in
Eq. (3.14) is similar to that of the "information-integration rules"
of behavioral decision theory [2]. A notable feature of this model is
that it affords simple computation, and does not require any
(numerical) approximations in its evaluation.

The results of Table 7 show that the simplified models perform
almost as well as the DDM. Model 1 matches the data well with respect
to measures AM and IRM, but performs poorly with respect to the error
probability measure, EPM. In fact, this is what motivated us to
include the subjective losses in the attractiveness measures.


Table 7: COMPARISON OF DDM WITH RELATED DECISION RULES

3.6  Summary 

This chapter described the results of the model-data validation
efforts. In order to validate the model, several time-history and
scalar measures of similarity between the model predictions and the
experimental data were proposed. The model-data validation effort
consisted of comparing the time-history metrics, such as the decision
probabilities, completion probabilities, incremental reward,
accumulated reward and error probability. Validation on the basis of
scalar measures consisted of comparing the average time spent on each
task, the difference between the incremental and accumulated rewards
of the model and data, etc.

When viewed in total, the model-data comparisons for all the cases
studied are excellent. This is achieved with a simple, intuitively
appealing decision model, using a nominal set of parameters
throughout. To be sure, there are some discrepancies, as in the
decision probability comparisons. However, these mismatches are not
major, and can be corrected by minor model modifications. The
model-predicted trends generally agree with the data.

Sensitivity analysis of the DDM has shown that the choice of the
parameter set is not critical, at least within a reasonable range of
variation. The performance of the DDM was contrasted with that of
several heuristic sequencing rules of scheduling theory, as well as
some related decision models. The results point to the clear
superiority of the DDM in representing human task sequencing
performance.


IV.  DISCUSSION  AND  EXTENSIONS 


The primary purpose of this research has been to gain an
understanding of human information-processing and task selection
procedures in dynamic multi-task environments. The approach has been
to combine the results of a joint analytic and experimental program
into a normative dynamic decision model (DDM) of human task
sequencing performance. To this end, a general multi-task paradigm
was developed that retains the essential features of human task
selection in a manageable, yet manipulable, context. Via this
framework, we have studied the effects of various task-related
variables on the human decision processes. The model that has emerged
from this effort could form a small, but significant, step towards
human modeling in complex supervisory control systems. In the
following, several suggestions for further research are given. These
include model refinements, model application to decision-aiding and
the modeling of multi-human decision-making in multi-task systems.

4.1  Modifications  of  the  Decision  Paradigm 

The concept of a decision state is fundamental to our analytic
modeling approach. The human's decision strategy depends directly on
his estimates of the decision state, once the task values and a
performance metric are given. In the present experimental context,
the decision state is related to the task state via a simple
functional transformation. Also, the tasks are assumed to be
independent and the task values are constant as the bar moves across
the CRT screen. This simplicity in the experimental paradigm enabled
us to develop the DDM by focusing on the underlying structural
aspects of the human decision processes, without the attendant task
complexities. However, future tests of the DDM should consider more
intricate task structures, such as those involving non-stationary
task attributes, task dependency and resource constraints. These and
other extensions are described below:

(i) Non-stationary task attributes: In many realistic situations,
the task attributes (e.g., value and velocity) may evolve in time, or
they may vary as a function of the human's decisions. For example, an
AAA gunner who fires at an enemy target may find that the target has
changed course and is diving towards the gunner's position. This
results in a change of the target's value and of the time available
to engage the target. The present experimental paradigm can be
modified to include time-varying task characteristics in a
straightforward manner. Moreover, the analytic framework of the model
is valid almost in toto for this case.

(ii) Task dependency: Since the subsystems of a complex system are
physically interconnected, the tasks are, in general, correlated.
This correlation may assume the form of precedence relations and/or
dependency among the attributes of different tasks. Precedence
relations pertain to the existence of technological restrictions on
the task sequence, or to a partial ordering among the tasks. The
precedence relations generally take the form of an assembly tree or a
branching tree. A relevant example of such a situation is the problem
of multi-RPV control, where some RPVs (e.g., ECM) must be brought
over the target area before the others.

This situation can be incorporated into the experimental paradigm by
not allowing the subject to engage certain tasks until he has
successfully completed their prerequisite tasks. In this case, the
analytic modeling of the decision process involves a two-step
procedure in which the sequencing phase is preceded by a labeling
phase that identifies feasible action subsets. Thus, only the set of
feasible decisions, P(t), along with any human limitations (e.g.,
loss of decision time in the labeling phase), needs to be identified.
On the other hand, the task correlations due to dependency among the
attributes of different tasks can be incorporated in the form of
coupled subsystems in a state space formulation. This will
undoubtedly increase the computational complexity of the model.
Hopefully, only a small number of tasks have such interactions.

(iii) Resource constraints: In practice, resources, such as fuel
and ammunition, are finite. Since the availability of resources has
implications for the human decision-making processes, future research
should investigate human decision behavior with resource constraints.
In the present experimental paradigm, a displayed resource may be the
total time a subject can expend in processing tasks. This research
could delineate the nature of differences in human behavior under
constrained and unconstrained situations.

(iv) A related paradigm: Although the present experimental paradigm
is well-suited to understanding the basic issues of a multi-task
decision problem, it is far too abstract to be of use in a specific
application. A means to overcome this limitation and, at the same
time, stay close to the manual control paradigms is to design an
experimental situation wherein the human interacts simultaneously
with several dissimilar dynamic processes. The task characteristics
can be manipulated by varying the nature and occurrence of
disturbances acting on the processes. This experimental paradigm is
ideally suited to study all the issues of a multi-task decision
problem, viz., task detection, task sequencing and task
implementation. The conceptual framework of the DDM is still valid.
The modeling process poses interesting, albeit solvable, control
theoretic problems.
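
The labeling phase described under task dependency above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `feasible_tasks`, the representation of precedence relations as a dictionary of prerequisite sets, and the task names are hypothetical and do not come from the model itself.

```python
def feasible_tasks(available, prerequisites, completed):
    """Labeling phase: keep only tasks whose prerequisites are all completed.

    available     : set of task names currently displayed
    prerequisites : dict mapping a task name to the set of tasks that
                    must be completed before it may be engaged
    completed     : set of task names already completed
    """
    return {
        task
        for task in available
        if prerequisites.get(task, set()) <= completed
    }


# Hypothetical multi-RPV example: the strike RPV may be engaged only
# after the ECM RPV has been brought over the target area.
PREREQUISITES = {"strike": {"ecm"}}
```

The sequencing phase would then be applied only to the returned subset; for instance, before the ECM task is completed the strike task is excluded from the feasible set, and it becomes eligible afterwards.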

4.2  Computer-aided  Decision-making 

With rapid advances in technology and higher levels of automation,
computers are increasingly being used in decision-making situations.
If the computer is to be accepted by the human as a decision-aid, or
if decision-making responsibility is to be allocated between human
and computer, then there must exist a symbiotic relationship between
the two. Computer-made decisions and/or information displays should
be compatible with human processing goals, implying that the computer
would require a model of the human! Successful interaction between
human and computer could reduce human work load, increase the
probability of correct decisions and reduce system risk.

The DDM developed in Chapter II is used in a covariance propagation
mode to predict ensemble or averaged statistics of human response.
However, for decision-aiding applications, one needs a Monte Carlo
(or sample-path) simulation of the model. The implementation of the
sample-path version of the DDM is similar to that of OCM [31]. That
is, the model mimics the human actions, complete with random number
generators that reproduce inherent human randomness. The simulation
generates time-histories of human decisions in response to any given
task arrival pattern. Using a Monte Carlo model, one can study the
potential application of computer-aiding at various levels as
outlined below:

(i) Information-processing mode: In this mode, the computer, using
a model of the human or its own internal model, displays information
relevant to decision-making. The displayed information can be of
various types: an assessment of the present and, possibly, future
task states ("raw data") or of the decision states ("reduced data");
or the detection of new tasks while the human is attending to a task.
Note that, in this mode, the computer provides information at the
pre-decision level. If this type of aiding is to be effective, the
information must be accessible in real time, and it must reduce the
memory load of the human.

(ii) Decision-prompting mode: In this mode, the model provides the
human with guidelines for making a decision so that he can
concentrate on a few vital decision alternatives. The computerized
model may be exercised to rank-order the importance of various
decisions via the attractiveness measures. These metrics are used
only as prompting information, with the DM free to select any of the
alternatives. If the model is truly representative of human decision
processes, there should be high correlation between human and
computer decisions. Moreover, this mode of aiding may be used to
investigate the human's ability to detect decision blunders by the
computer, and it may answer the important question: Should a machine,
in order to help or replace us, act like us?

(iii) Decision-sharing mode: In situations where the human
potentially encounters more tasks than he can satisfactorily perform,
allocation of decision-making responsibility between human and
computer may be the best mode of human-computer interaction. In order
that the human-computer interaction be efficient, the actions of the
computer must be transparent to the human, and the computer should be
able to infer what the human is doing. Thus, a model of the human
allows for covert communication between the human and computer, and
reduces the need for overt communication. Moreover, a model of the
human can be used to predict future courses of action by the human so
that the computer can strive to avoid them. This results in a
reduction of conflicts, a particularly desirable feature under high
work load situations.
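
A sample-path simulation and the rank-ordering used for decision prompting can be sketched as follows. This is a minimal illustration, not the DDM itself: it assumes that the attractiveness measures are positive so that they can be normalized directly into choice probabilities, and the names `toy_attractiveness`, `sample_path` and `ensemble_frequency` as well as all numerical values are hypothetical.

```python
import random


def toy_attractiveness(k):
    """Hypothetical positive attractiveness measures at decision step k."""
    return {"idle": 1.0, "A": 6.0 - k, "B": 3.0}


def choice_probabilities(attractiveness):
    """Normalize positive measures into choice probabilities (Luce form)."""
    total = sum(attractiveness.values())
    return {name: m / total for name, m in attractiveness.items()}


def rank_alternatives(attractiveness):
    """Decision prompting: alternatives ordered from most to least attractive."""
    return sorted(attractiveness, key=attractiveness.get, reverse=True)


def sample_path(attractiveness_at, n_steps, rng):
    """One Monte Carlo time-history of decisions over n_steps steps."""
    path = []
    for k in range(n_steps):
        probs = choice_probabilities(attractiveness_at(k))
        names = list(probs)
        weights = [probs[n] for n in names]
        path.append(rng.choices(names, weights=weights, k=1)[0])
    return path


def ensemble_frequency(attractiveness_at, step, alternative, n_runs, seed=0):
    """Fraction of sample paths that select `alternative` at `step`."""
    rng = random.Random(seed)
    hits = sum(
        sample_path(attractiveness_at, step + 1, rng)[step] == alternative
        for _ in range(n_runs)
    )
    return hits / n_runs
```

Averaging many sample paths approximates the ensemble statistics; in this toy case the frequency with which "A" is chosen at step 0 approaches its choice probability of 0.6 as the number of runs grows, while each individual path is one possible time-history of decisions.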

4.3  Multiple  Human  Decision-makers 

The study of a multi-task system with multiple decision-makers can
be approached at various levels of complexity. The analytic framework
of the DDM can be extended, at least conceptually, to a centralized
decision-making system in which tasks arrive at a central supervisor
who, in turn, routes them to various subordinate decision-makers. The
individual decision-makers have the responsibility of sequencing
tasks in their respective queues. The overall decision process
involves finding a global routing strategy for the supervisor and
local sequencing strategies for the individual subordinates, taking
into account inherent and inter-human randomness.

A more realistic and challenging problem is the modeling of multiple
DMs in distributed multi-task systems. Here tasks arrive at each
individual DM. An individual DM has to determine whether to keep an
arriving task for himself or send it to someone else, and which task,
if any, he should process. Thus, the decision process requires the
specification of a local routing strategy and a local sequencing
strategy for each DM. The decision process is affected by
communication, the information pattern at each DM, hierarchical
structures, and inter-human randomness and variability, to name but a
few factors.

4.4  Summary 

In this chapter, we have delineated three logical extensions of the
present research. The first relates to exercising the model in more
complex multi-task situations such as those involving non-stationary
task attributes, task dependency or resource allocation constraints.
This research serves to refine and validate the DDM. The second
extension seeks to use the model for studying computer-aided
technology. In this context, three modes of interaction between the
human and computer are identified, viz., the information-processing
mode, the decision-prompting mode and the decision-sharing mode. It
was concluded that in all modes of operation, the computer must have,
as a reference, an internal model of the human for effective
man-computer interaction. Finally, the third extension relates to
developing models suitable for multi-task systems with multiple
decision-makers. This research poses problems of immense analytic
difficulty, but, if solved, will be extremely useful in understanding
the human component of a complex supervisory control system. It is
hoped that future contributions will be along these lines.


APPENDIX  A 


LUCE’S  CHOICE  AXIOM 

The observed inconsistency and uncertainty associated with human
decision behavior have led to two classes of probabilistic choice
models. These are the random utility models and the constant utility
models. The random utility models (called the "discriminable
dispersion models" by psychologists and "probit analysis" by
statisticians) assume that the utility, or the value, of each
alternative is intrinsically variable at the subjective level, and
that the alternative with the highest momentary value is chosen.
Thus, in these models, the uncertainty in choice is attributed to the
randomness in utility. The constant utility models, on the other
hand, assume that the value assigned to each of the alternatives is
fixed, but that the choice is a probabilistic function of these
values. Here, the randomness in choice is attributed to uncertainty
in the decision rule. Although these two types of choice
representation are very different in psychological terms, they are
somewhat compatible in mathematical terms. This is because some forms
of probabilistic choice models can be interpreted as either random or
constant utility models [24, 25]. The random utility models have
their origins in the works of Thurstone on psychophysical scaling
[31] and later Block and Marschak on probabilistic theories of
response [32], whereas the constant utility models have largely been
influenced by Luce's choice axiom [24-27].

Luce's choice axiom is a probabilistic formulation of Arrow's [33]
famed "law of irrelevant alternatives". The axiom, in essence, says
that our preferences between two alternatives (stimuli) do not change
when other alternatives are added to, or discarded from, the overall
set of alternatives. The axiom has been invoked, implicitly or
explicitly, in psychophysical scaling, utility theory, decision
theory, stochastic learning theory and in many psychometric models.
This is because of its simplicity and the resulting computational
attractiveness. In the following, the axiom is formally stated and
its implications for developing a stochastic choice model are
discussed.

A.1  Notation  and  Preliminaries 

Let $T = \{x, y, z, \ldots\}$ denote a finite set of independent
alternatives (e.g., x is the minimum value of some random variables
associated with a process state transition, x is the maximum
attractiveness measure, etc.). We use A, B, C, ... to denote the
non-empty subsets of T. We let $P_A(x)$ represent the probability of
choosing an alternative x when only the subset A of alternatives is
offered to the DM. The usual probability axioms are assumed to hold
for all A. Clearly, $P_T(x)$ is the probability of selecting x when
the entire set T is presented to the DM and

$P_T(A) = \sum_{x \in A} P_T(x)$

For brevity, we let P(x:y) denote $P_{\{x,y\}}(x)$, the probability
that the DM selects x when asked to choose between x and y. Also, we
assume that $P(x:x) = \frac{1}{2}$.

A.2  Choice  Axiom 

The choice axiom, in essence, states that the removal of some
alternatives does not alter the relative probabilities of choice
among the remaining alternatives. That is, the presence or absence of
an alternative is irrelevant to the relative probabilities of choice
between two other alternatives, although the absolute values of these
probabilities will generally be affected. Formally, for all
$x \in A \subset T$,

$P_A(x) = P_T(x \mid A)$      (A.1)

whenever the conditional probability exists.


The choice axiom says that the choice from the subset A is
independent of what else may have been available. In other words,
even when the entire set T is offered to the DM, if we only look at
those occasions when the choices are made from the subset A, then the
probability of selecting x from A, $P_T(x \mid A)$, is identical to
the probability of selecting x from A, $P_A(x)$, when only the subset
A was presented to the DM in the first place.

By the definition of conditional probability,

$P_T(x \mid A) = \frac{P_T(x, A)}{P_T(A)} = \frac{P_T(x)}{P_T(A)}$

so Eq. (A.1) can be rewritten as

$P_T(x) = P_T(A) \cdot P_A(x)$      (A.2a)

or

$P_A(x) = \frac{P_T(x)}{P_T(A)}$      (A.2b)


Eq. (A.2) provides an alternate interpretation of the choice axiom.
It says that the overall probability of choosing an element x from
the set T, $P_T(x)$, may be viewed as a multi-stage process. First,
the probability of choosing A from T, $P_T(A)$, is estimated, and
then the probability of choosing x from A, $P_A(x)$, is computed.
Note that the subset A may be subdivided a number of times until a
single element x remains. Moreover, the axiom implies that the
product $P_T(A) \cdot P_A(x)$ is independent of the way in which T is
partitioned into subsets! Clearly, intuition suggests that the axiom
cannot be expected to hold in complex interdependent situations. We
will discuss the limitations of the axiom later.
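
The multi-stage interpretation can be checked numerically for one family of choice probabilities that satisfies the axiom, namely probabilities proportional to a ratio scale, $P_A(x) = v(x)/\sum_{y \in A} v(y)$. The sketch below is illustrative only; the scale values and function names are assumptions, and exact rational arithmetic is used so that the equality is not blurred by rounding.

```python
from fractions import Fraction

# Hypothetical ratio-scale values v(x) for four alternatives.
SCALE = {"x": 1, "y": 2, "z": 3, "w": 4}
T = frozenset(SCALE)


def choice_prob(subset, x):
    """P_A(x): probability of choosing x when only `subset` is offered."""
    return Fraction(SCALE[x], sum(SCALE[y] for y in subset))


def subset_prob(subset):
    """P_T(A): probability that the choice from T falls in `subset`."""
    return sum(choice_prob(T, y) for y in subset)


def two_stage(subset, x):
    """Product P_T(A) * P_A(x) of the two-stage choice process."""
    return subset_prob(subset) * choice_prob(subset, x)
```

For every subset A containing x, `two_stage(A, "x")` returns the same value as `choice_prob(T, "x")`, which is 1/10 for the assumed scale, regardless of how T is partitioned.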

Below,  we  prove  some  trivial  consequences  of  the  choice  axiom  as  a 
prelude  to  deriving  a  stochastic  choice  model. 

Lemma 1

Suppose that the choice axiom holds for all A, $x \in A \subset T$.

(i) If $P_T(x) \ne 0$, then $P_A(x) \ne 0$.

(ii) If $P_T(x) = 0$ and $P_T(A) \ne 0$, then $P_A(x) = 0$.

(iii) If $P_T(y) = 0$ and $y \ne x$, then $P_T(x) = P_{T-\{y\}}(x)$.

Proof

(i) Since $x \in A$, $P_T(x) \ne 0$ implies $P_T(A) \ne 0$. Thus,

$P_A(x) = P_T(x \mid A) = \frac{P_T(x)}{P_T(A)} \ne 0$

(ii) Since $P_T(x) = 0$ and $P_T(A) \ne 0$,

$P_A(x) = \frac{P_T(x)}{P_T(A)} = 0$

(iii) $P_T(y) = 0$ implies $P_T(T-\{y\}) = 1$. Using this and the
fact that $x \ne y$, we have

$P_T(x \mid T-\{y\}) = P_T(x)$

By the choice axiom,

$P_T(x \mid T-\{y\}) = P_{T-\{y\}}(x) = P_T(x)$

The result (iii) shows that an alternative that is never chosen may
be removed from the set without affecting the choice probabilities.
The fact that this process may be repeated in any order, until all
the choice probabilities are positive, is guaranteed by (i) and (ii).

A.3  Stochastic  Choice  Model 

Here, we prove that if the choice axiom holds, all the choice
probabilities are determined by the pairwise probabilities. In the
following, we assume, without loss of generality, that the choice
probabilities are positive.

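Theorem 1

If for all $x \in T$, $P_T(x) \ne 0$ and if the choice axiom holds
for all x and A such that $x \in A \subset T$, then

(i) $\frac{P(x:y)}{P(y:x)} = \frac{P_T(x)}{P_T(y)} = \frac{P_A(x)}{P_A(y)}$      (A.3)

and

(ii) $P_T(x) = \Bigl[1 + \sum_{y \in T-\{x\}} \frac{P(y:x)}{P(x:y)}\Bigr]^{-1}$      (A.4)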


Proof

(i) By the choice axiom,

$P_T(x) = P_{\{x,y\}}(x) \cdot P_T(\{x,y\}) = P(x:y)\,[P_T(x) + P_T(y)]$

so

$P_T(x)\,[1 - P(x:y)] = P(x:y)\,P_T(y)$

Noting that $P(y:x) = 1 - P(x:y)$, we have

$\frac{P(x:y)}{P(y:x)} = \frac{P_T(x)}{P_T(y)}$

The result can be extended to include any subset $A \subset T$ that
contains the alternatives x and y. The condition in Eq. (A.3) states
that the odds of x being chosen over y from any set containing them
equal the odds of a binary choice of x over y. This rather important
consequence of the choice axiom is variously referred to as the
independence from irrelevant alternatives in economics and Clarke's
Constant Ratio Rule in psychology, since it was independently
proposed by Clarke [34].
(ii) To prove part (ii), consider the term

$1 + \sum_{y \in T-\{x\}} \frac{P(y:x)}{P(x:y)} = \frac{P_T(x)}{P_T(x)} + \sum_{y \in T-\{x\}} \frac{P_T(y)}{P_T(x)} = \frac{1}{P_T(x)} \sum_{y \in T} P_T(y) = \frac{1}{P_T(x)}$

The required result immediately follows. Eq. (A.4) is similar to
Eqs. (2.30), (2.38) and (2.39). Other consequences of the choice
axiom, such as stochastic transitivity and the existence of a ratio
scale, may be found in [24-27].
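
The statement that all choice probabilities are determined by the pairwise probabilities can be illustrated with a short computation of the bracketed expression in part (ii) of Theorem 1. This is a sketch under stated assumptions: the pairwise probabilities are generated from a hypothetical ratio scale, $P(x:y) = v(x)/(v(x) + v(y))$, so that they are mutually consistent with the choice axiom; the names and values are not taken from the model.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical ratio-scale values used only to build consistent pairwise data.
SCALE = {"x": 1, "y": 2, "z": 3}

# PAIRWISE[(a, b)] = P(a:b), the probability of choosing a over b.
PAIRWISE = {
    (a, b): Fraction(SCALE[a], SCALE[a] + SCALE[b])
    for a, b in permutations(SCALE, 2)
}


def prob_from_pairwise(pairwise, alternatives, x):
    """P_T(x) = 1 / (1 + sum over y != x of P(y:x) / P(x:y))."""
    odds_against = sum(
        pairwise[(y, x)] / pairwise[(x, y)]
        for y in alternatives
        if y != x
    )
    return 1 / (1 + odds_against)
```

For the assumed scale, the function returns 1/6, 1/3 and 1/2 for x, y and z, which sum to one and coincide with the ratio-scale probabilities $v(x)/\sum_y v(y)$, using nothing but pairwise comparisons as input.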

A.4  Discussion 

Luce's choice axiom provides a powerful means to construct a
rational, probabilistic theory of individual choice behavior. The
empirical evidence [27] suggests that it works very well in some
situations, and not so well in others. Here, we summarize the
advantages and limitations of the axiom, and indicate decision
situations where it can profitably be applied.

The primary advantages of the choice axiom are that it allows for
easy computation of choice probabilities via pairwise comparisons,
and that it provides a simple means to add new alternatives or
subtract from existing ones. The latter also points to a weakness in
the axiom in that the independence of irrelevant alternatives is
implausible in situations where some of the alternatives are similar.
This is exemplified by the often cited objection of Debreu [35] to
the choice axiom. Suppose we are choosing among a pony (x), a bicycle
(y) and another bicycle (z). That is, $T = \{x, y, z\}$. Assume that
all pairwise choice probabilities equal $\frac{1}{2}$. Since y and z
are duplicates of each other, one expects that
$P_T(x) = \frac{1}{2}$, while $P_T(y) = P_T(z) = \frac{1}{4}$. Data
support this intuition. However, if the choice axiom is assumed to
hold, then all trinary choice probabilities equal $\frac{1}{3}$. This
example shows that two alternatives (x and y), which are equivalent
in one context (i.e., $P(x:y) = \frac{1}{2}$), are not equivalent in
another context (i.e., $P_T(x) \ne P_T(y)$), contrary to the
independence of irrelevant alternatives. Thus, the application of the
choice axiom should be restricted to situations where the
alternatives can be assumed to be distinct and independent, such as
those in the present work.
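
As a worked check of the value the choice axiom assigns in Debreu's example, substituting equal pairwise probabilities of one half into the expression of Theorem 1, part (ii), gives the same result for each of the three alternatives:

```latex
\[
P_T(x) = \Bigl[\,1 + \sum_{y \in T-\{x\}} \frac{P(y:x)}{P(x:y)}\Bigr]^{-1}
       = \Bigl[\,1 + \frac{1/2}{1/2} + \frac{1/2}{1/2}\Bigr]^{-1}
       = \frac{1}{3}
\]
```

The axiom therefore cannot reproduce the intuitive split in which the pony is chosen half of the time and each of the two bicycles a quarter of the time.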



BIBLIOGRAPHY 


1. W. Edwards, "Bayesian and regression models of human information
processing - a myopic perspective," Organizational Behavior and
Human Performance, Vol. 6, No. 6, Nov. 1971, pp. 639-648.

2. K. R. Pattipati and D. L. Kleinman, "A survey of the theories of
individual choice behavior," Univ. of Conn. Tech. Report, EECS
TR-79-12, August 1979.

3. R. A. Howard, "Social decision analysis," Proc. IEEE, Vol. 63,
No. 3, March 1975, pp. 359-371.

4. G. L. Nemhauser, Introduction to Dynamic Programming, New York:
Wiley, 1967.

5. W. Edwards, "The prediction of decisions among bets," Journal of
Experimental Psychology, Vol. 50, 1955, pp. 201-214.

6. T. B. Sheridan, "Optimum allocation of personal presence," IEEE
Trans. on Systems Science and Cybernetics, Vol. SSC-6, No. 3, July
1970, pp. 242-244.

7. W. B. Rouse and J. S. Greenstein, "A model of human
decision-making in multi-task situations: implications for computer
aiding," Proc. of the 6th Int. Conf. on Cybernetics and Society,
Washington D. C., Sept. 1976.

8. M. K. Tulga, "Dynamic decision-making in a multi-task supervisory
control: Comparison of an optimal algorithm to human behavior,"
Sc.D. dissertation, M.I.T., Sept. 1978.

9. G. A. Miller, "The magical number seven, plus or minus two," The
Psychological Review, Vol. 63, No. 2, March 1956, pp. 81-97.

10. K. R. Pattipati, A. R. Ephrath and D. L. Kleinman, "Analysis of
human decision-making in multi-task environments," Univ. of Conn.
Tech. Report, EECS TR-79-15, Nov. 1979.

11. K. R. Pattipati and D. L. Kleinman, "Application of dynamic
programming to priority assignment in a class of queueing systems
with impatient customers," to appear in the Proc. of the IEEE Conf.
on Decision and Control, Albuquerque, NM, Dec. 1980. Also submitted
to IEEE Trans. on Automatic Control for publication.

12. G. B. Becker and C. B. McClintock, "Value: Behavioral decision
theory," Annual Review of Psychology, Vol. 18, 1967, pp. 239-286.




13.  R.  M.  Hogarth,  "Cognitive  processes  and  the  assessment  of  subjective 
probability  distributions, 11  Journal  of  the  American  Statistical 
Association,  Vol.  70,  No.  350,  June  1975,  pp.  271-289. 

14.  D.  L,  Kleinman,  S.  Baron  and  W.  H.  Levison,  "A  Control  theoretic 
approach  to  manned-vehicle  systems  analysis,"  IEEE  Trans,  on  Automat¬ 
ic  Control,  Vol.  AC-16,  No.  6,  Dec.  1971,  pp.  824-832. 

15.  R.  E.  Curry,  D.  L.  Kleinman  and  W.  C.  Hoffman,  "A  design  procedure 
for  control/display  systems,"  Human  Factors,  Vol.  19,  No.  5,  Oct, 

1977,  pp.  421-436. 

16.  D.  L.  Kleinman,  "Solving  the  optimal  attention  allocation  problem  in 
manual  control,"  IEEE  Trans,  on  Automatic  Control,  Vol.  AC-21,  No.  6, 
Dec.  1976,  pp.  813-822. 

17.  W.  H.  Levison  and  R.  B.  Tanner,  "A  Control  theory  model  for  Human 
decision  making,"  NASA  CR-1953,  Dec.  1971. 

18.  E.  G.  Gai  and  R.  E.  Curry,  "A  Model  of  the  human  observer  in  failure 
detection tasks," IEEE Trans. on Systems, Man, and Cybernetics, Vol.
SMC-6, No. 2, Feb. 1976, pp. 85-94.

19. A. V. Oppenheim and R. W. Schafer, Digital Signal Processing,
Englewood Cliffs, NJ: Prentice-Hall, 1975.

20. I. Kanter, "The ratios of functions of random variables," IEEE Trans.
on Aerospace and Electronic Systems, Vol. 13, No. 6, Nov. 1977,
pp. 624-630.

21. R. A. Howard, Dynamic Probabilistic Systems, Vol. II, New York: Wiley,
1971.

22. CRC Handbook of Tables for Mathematics, Cleveland, Ohio: The Chemical
Rubber Company, 1970.

23. N. M. Steen, G. D. Byrne and E. M. Gelbard, "Gaussian quadratures for
the integrals $\int_0^\infty e^{-x^2} f(x)\,dx$ and $\int_0^b e^{-x^2} f(x)\,dx$,"
Mathematics of Computation, Vol. 23, No. 107, July 1969, pp. 661-667.

24. R. D. Luce and E. Galanter, "Discrimination," in R. D. Luce, R. R.
Bush and E. Galanter (Eds.), Handbook of Mathematical Psychology,
Vol. I, New York: Wiley, 1963.

25. R. D. Luce and P. Suppes, "Preference, utility and subjective
probability," in R. D. Luce, R. R. Bush and E. Galanter (Eds.), Handbook
of Mathematical Psychology, Vol. III, New York: Wiley, 1965.

26. R. D. Luce, Individual Choice Behavior: a Theoretical Analysis,
New York: Wiley, 1959.




27.  R.  D.  Luce,  "The  Choice  Axiom  after  Twenty  Years,"  Journal  of 
Mathematical  Psychology,  Vol.  15,  1977,  pp.  215-233. 

28.  K.  R.  Baker,  Introduction  to  Sequencing  and  Scheduling,  New  York: 
Wiley,  1974. 

29. R. A. Howard, "The Foundations of Decision Analysis," IEEE Trans. on
Systems Science and Cybernetics, Vol. SSC-4, No. 3, Sept. 1968,
pp. 211-219.

30.  D.  L.  Kleinman,  "Monte  Carlo  simulation  of  human  operator  response," 
Univ. of Conn. Tech. Report, EECS TR-77-1, Feb. 1977.

31.  L.  L.  Thurstone,  "A  Law  of  comparative  judgement,"  Psychological 
Review,  Vol.  34,  1927,  pp.  273-286. 

32. H. D. Block and J. Marschak, "Random orderings and stochastic
theories of responses," in I. Olkin, S. Ghurye, W. Hoeffding,
W. Madow and H. Mann (Eds.), Contributions to probability and
statistics, Stanford: Stanford Univ. Press, 1960.

33.  K.  J.  Arrow,  Social  Choice  and  Individual  Values,  New  York:  Wiley, 
1964. 

34. F. R. Clarke, "Constant-ratio Rule for confusion Matrices in speech
communication," Journal of Acoustical Society of America, Vol. 29,
1957, pp. 715-720.

35. G. Debreu, "Review of R. D. Luce, Individual Choice Behavior: a
Theoretical Analysis," American Economic Review, Vol. 50, 1960,
pp. 186-188.


